Preface


This book is a compact introduction to many of the important topics of 
mathematical logic, comprising natural and unrestricted set-theoretic 
methods. Here is a very brief sketch of some of its contents: 


1. One of the most prominent features of this new edition is a consistency
proof for formal number theory due to Kurt Schütte. This
proof had been included in the first edition in 1964. It was dropped 
in later editions and is now brought back by “popular demand.” 
Quite a few people thought I had made a mistake in abandoning it. 


2. There is now a greatly enlarged bibliography, with items that should 
be interesting to a wide audience. Many of them have to do with 
the philosophical significance of some important results of modern 
mathematical logic. 


As before, the material in this book can be covered in two semesters, 
but Chapters 1 through 3 are quite adequate for a one-semester course. 
Bibliographic references are aimed at giving the best source of information, 
which is not always the earliest; hence, these references give no indication of 
priority. 

I believe that the essential parts of the book can be read with ease by any- 
one with some experience in abstract mathematical thinking. There is, how- 
ever, no specific prerequisite. 

This book owes an obvious debt to the standard works of Hilbert and 
Bernays (1934, 1939), Kleene (1952), Rosser (1953), and Church (1956). I am also
grateful to many people for their help, including my editor Jessica Vakili, as 
well as the editors of the earlier editions.


Introduction


One of the popular definitions of logic is that it is the analysis of methods of 
reasoning. In studying these methods, logic is interested in the form rather 
than the content of the argument. For example, consider these two arguments: 


1. All men are mortal. Socrates is a man. Hence, Socrates is mortal. 
2. All cats like fish. Silvy is a cat. Hence, Silvy likes fish. 


Both have the same form: All A are B. S is an A. Hence, S is a B. The truth or 
falsity of the particular premises and conclusions is of no concern to logi- 
cians. They want to know only whether the premises imply the conclusion. 
The systematic formalization and cataloguing of valid methods of reasoning 
are a main task of logicians. If the work uses mathematical techniques or if it 
is primarily devoted to the study of mathematical reasoning, then it may be 
called mathematical logic. We can narrow the domain of mathematical logic if 
we define its principal aim to be a precise and adequate understanding of the 
notion of mathematical proof. 

Impeccable definitions have little value at the beginning of the study of a 
subject. The best way to find out what mathematical logic is about is to start 
doing it, and students are advised to begin reading the book even though 
(or especially if) they have qualms about the meaning and purpose of the 
subject. 

Although logic is basic to all other studies, its fundamental and apparently 
self-evident character discouraged any deep logical investigations until the 
late nineteenth century. Then, under the impetus of the discovery of non- 
Euclidean geometry and the desire to provide a rigorous foundation for 
calculus and higher analysis, interest in logic was revived. This new inter- 
est, however, was still rather unenthusiastic until, around the turn of the 
century, the mathematical world was shocked by the discovery of the para- 
doxes—that is, arguments that lead to contradictions. The most important 
paradoxes are described here. 


1. Russell's paradox (1902): By a set, we mean any collection of objects— 
for example, the set of all even integers or the set of all saxophone 
players in Brooklyn. The objects that make up a set are called its 
members or elements. Sets may themselves be members of sets; for 
example, the set of all sets of integers has sets as its members. Most 
sets are not members of themselves; the set of cats, for example, is not 
a member of itself because the set of cats is not a cat. However, there 
may be sets that do belong to themselves: perhaps, for example,
a set containing all sets. Now, consider the set A of all those sets X
such that X is not a member of X. Clearly, by definition, A is a member
of A if and only if A is not a member of A. So, if A is a member of
A, then A is also not a member of A; and if A is not a member of A,
then A is a member of A. In any case, A is a member of A and A is not
a member of A (see Link, 2004). 


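2. Cantor’s paradox (1899): This paradox involves the theory of cardinal
numbers and may be skipped by those readers having no previous
acquaintance with that theory. The cardinal number $\overline{\overline{Y}}$ of a
set Y is a measure of the size of the set; $\overline{\overline{Y}} = \overline{\overline{Z}}$
if and only if Y is equinumerous with Z (i.e., there is a one-one
correspondence between Y and Z). We define $\overline{\overline{Y}} \le \overline{\overline{Z}}$
to mean that Y is equinumerous with a subset of Z; by
$\overline{\overline{Y}} < \overline{\overline{Z}}$ we mean $\overline{\overline{Y}} \le \overline{\overline{Z}}$
and $\overline{\overline{Y}} \ne \overline{\overline{Z}}$. Cantor proved that if $\mathcal{P}(Y)$
is the set of all subsets of Y, then $\overline{\overline{Y}} < \overline{\overline{\mathcal{P}(Y)}}$.
Let V be the universal set, that is, the set of all sets. Now, $\mathcal{P}(V)$ is a
subset of V; so it follows easily that $\overline{\overline{\mathcal{P}(V)}} \le \overline{\overline{V}}$.
On the other hand, by Cantor’s theorem, $\overline{\overline{V}} < \overline{\overline{\mathcal{P}(V)}}$.
Bernstein’s theorem asserts that if $\overline{\overline{Y}} \le \overline{\overline{Z}}$ and
$\overline{\overline{Z}} \le \overline{\overline{Y}}$, then $\overline{\overline{Y}} = \overline{\overline{Z}}$.
Hence, $\overline{\overline{V}} = \overline{\overline{\mathcal{P}(V)}}$, contradicting
$\overline{\overline{V}} < \overline{\overline{\mathcal{P}(V)}}$.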

3. Burali-Forti’s paradox (1897): This paradox is the analogue in the the- 
ory of ordinal numbers of Cantor’s paradox and requires familiarity 
with ordinal number theory. Given any ordinal number, there is still 
a larger ordinal number. But the ordinal number determined by the 
set of all ordinal numbers is the largest ordinal number. 


4. The liar paradox: A man says, “I am lying.” If he is lying, then what he
says is true and so he is not lying. If he is not lying, then what he says 
is true, and so he is lying. In any case, he is lying and he is not lying.* 

5. Richard's paradox (1905): Some phrases of the English language denote 
real numbers; for example, “the ratio between the circumference and 
diameter of a circle” denotes the number π. All the phrases of the
English language can be enumerated in a standard way: order all 
phrases that have k letters lexicographically (as in a dictionary) and 
then place all phrases with k letters before all phrases with a larger 
number of letters. Hence, all phrases of the English language that 
denote real numbers can be enumerated merely by omitting all other 
phrases in the given standard enumeration. Call the nth real number 
in this enumeration the nth Richard number. Consider the phrase: 
“the real number whose nth decimal place is 1 if the nth decimal
place of the nth Richard number is not 1, and whose nth decimal
place is 2 if the nth decimal place of the nth Richard number is 1.”
This phrase defines a Richard number, say, the kth Richard number;
but, by its definition, it differs from the kth Richard number in
the kth decimal place.


* The Cretan “paradox,” known in antiquity, is similar to the liar paradox. The Cretan philosopher
Epimenides said, “All Cretans are liars.” If what he said is true, then, since Epimenides
is a Cretan, it must be false. Hence, what he said is false. Thus, there must be some Cretan
who is not a liar. This is not logically impossible; so we do not have a genuine paradox.
However, the fact that the utterance by Epimenides of that false sentence could imply the
existence of some Cretan who is not a liar is rather unsettling.


6. Berry’s paradox (1906): There are only a finite number of symbols (let- 
ters, punctuation signs, etc.) in the English language. Hence, there 
are only a finite number of English expressions that contain fewer 
than 200 occurrences of symbols (allowing repetitions). There are, 
therefore, only a finite number of positive integers that are denoted 
by an English expression containing fewer than 200 occurrences 
of symbols. Let k be “the least positive integer that is not denoted by an
English expression containing fewer than 200 occurrences of symbols.” The
quoted English phrase contains fewer than 200 occurrences of
symbols and denotes the integer k.


7. Grelling’s paradox (1908): An adjective is called autological if the prop- 
erty denoted by the adjective holds for the adjective itself. An adjec- 
tive is called heterological if the property denoted by the adjective 
does not apply to the adjective itself. For example, “polysyllabic” and 
“English” are autological, whereas “monosyllabic” and “French” are 
heterological. Consider the adjective “heterological.” If “heterologi- 
cal” is heterological, then it is not heterological. If “heterological” is 
not heterological, then it is heterological. In either case, “heterologi- 
cal” is both heterological and not heterological. 


8. Löb’s paradox (1955): Let A be any sentence. Let B be the sentence: “If
this sentence is true, then A.” So B asserts, “If B is true, then A.” Now 
consider the following argument: Assume B is true; then, by B, since 
B is true, A holds. This argument shows that if B is true, then A. But 
this is exactly what B asserts. Hence, B is true. Therefore, by B, since 
B is true, A is true. Thus, every sentence is true. (This paradox may 
be more accurately attributed to Curry [1942].) 


All of these paradoxes are genuine in the sense that they contain no obvi- 
ous logical flaws. The logical paradoxes (1-3) involve only notions from the 
theory of sets, whereas the semantic paradoxes (4-8) also make use of con- 
cepts like “denote,” “true,” and “adjective,” which need not occur within our 
standard mathematical language. For this reason, the logical paradoxes are 
a much greater threat to a mathematician’s peace of mind than the semantic 
paradoxes. 

Analysis of the paradoxes has led to various proposals for avoiding them. 
All of these proposals are restrictive in one way or another of the “naive” 
concepts that enter into the derivation of the paradoxes. Russell noted the 
self-reference present in all the paradoxes and suggested that every object 
must have a definite nonnegative integer as its “type.” Then an expression
“x is a member of the set y” is to be considered meaningful if and only if the
type of y is one greater than the type of x. 

This approach, known as the theory of types and systematized and devel- 
oped in Principia Mathematica by Whitehead and Russell (1910-1913), is suc- 
cessful in eliminating the known paradoxes,* but it is clumsy in practice and 
has certain other drawbacks as well. A different criticism of the logical para- 
doxes is aimed at their assumption that, for every property P(x), there exists 
a corresponding set of all objects x that satisfy P(x). If we reject this assumption,
then the logical paradoxes are no longer derivable.† It is necessary, however,
to provide new postulates that will enable us to prove the existence of
those sets that are needed by the practicing mathematician. The first such 
axiomatic set theory was invented by Zermelo (1908). In Chapter 4, we shall 
present an axiomatic theory of sets that is a descendant of Zermelo’s system 
(with some new twists given to it by von Neumann, R. Robinson, Bernays, 
and Gödel). There are also various hybrid theories combining some aspects
of type theory and axiomatic set theory—for example, Quine’s system NF. 

A more radical interpretation of the paradoxes has been advocated by 
Brouwer and his intuitionist school (see Heyting, 1956). They refuse to accept 
the universality of certain basic logical laws, such as the law of excluded 
middle: P or not P. Such a law, they claim, is true for finite sets, but it is 
invalid to extend it on a wholesale basis to all sets. Likewise, they say it is 
invalid to conclude that “There exists an object x such that not-P(x)” follows 
from the negation of “For all x, P(x)”; we are justified in asserting the existence
of an object having a certain property only if we know an effective
method for constructing (or finding) such an object. The paradoxes are not
derivable (or even meaningful) if we obey the intuitionist strictures, but neither
are many important theorems of everyday mathematics, and for this reason,
intuitionism has found few converts among mathematicians. 


Exercises 


P.1 Use the sentence 
(*) This entire sentence is false or 2 + 2 = 5
to prove that 2 + 2 = 5. Comment
on the significance of this proof. 


P.2 Show how the following has a paradoxical result.
The smallest positive integer that is not denoted by a phrase in this 
book. 


* Russell’s paradox, for example, depends on the existence of the set A of all sets that are not
members of themselves. Because, according to the theory of types, it is meaningless to say 
that a set belongs to itself, there is no such set A. 

† Russell’s paradox then proves that there is no set A of all sets that do not
belong to themselves. The paradoxes of Cantor and Burali-Forti show that there
is no universal set and no set that contains all ordinal numbers. The semantic
paradoxes cannot even be formulated, since they involve notions not expressible
within the system.

Whatever approach one takes to the paradoxes, it is necessary first to
examine the language of logic and mathematics to see what symbols may be 
used, to determine the ways in which these symbols are put together to form 
terms, formulas, sentences, and proofs and to find out what can and cannot 
be proved if certain axioms and rules of inference are assumed. This is one of 
the tasks of mathematical logic, and until it is done, there is no basis for comparing
rival foundations of logic and mathematics. The deep and devastating
results of Gödel, Tarski, Church, Rosser, Kleene, and many others have
been ample reward for the labor invested and have earned for mathematical 
logic its status as an independent branch of mathematics. 

For the absolute novice, a summary will be given here of some of the basic 
notations, ideas, and results used in the text. The reader is urged to skip 
these explanations now and, if necessary, to refer to them later on. 

A set is a collection of objects.* The objects in the collection are called
elements or members of the set. We shall write “$x \in y$” for the statement that
x is a member of y. (Synonymous expressions are “x belongs to y” and “y
contains x.”) The negation of “$x \in y$” will be written “$x \notin y$.”

By “$x \subseteq y$” we mean that every member of x is also a member of y (synonymously,
that x is a subset of y or that x is included in y). We shall write “t = s” to
mean that t and s denote the same object. As usual, “$t \ne s$” is the negation of
“t = s.” For sets x and y, we assume that x = y if and only if $x \subseteq y$ and $y \subseteq x$, that
is, if and only if x and y have the same members. A set x is called a proper
subset of a set y, written “$x \subset y$,” if $x \subseteq y$ but $x \ne y$. (The notation $x \subsetneq y$ is often
used instead of $x \subset y$.)

The union $x \cup y$ of sets x and y is defined to be the set of all objects that are
members of x or y or both. Hence, $x \cup x = x$, $x \cup y = y \cup x$, and
$(x \cup y) \cup z = x \cup (y \cup z)$. The intersection $x \cap y$ is the set of objects that x and y
have in common. Therefore, $x \cap x = x$, $x \cap y = y \cap x$, and
$(x \cap y) \cap z = x \cap (y \cap z)$. Moreover, $x \cap (y \cup z) = (x \cap y) \cup (x \cap z)$ and
$x \cup (y \cap z) = (x \cup y) \cap (x \cup z)$. The relative complement $x - y$ is the set of
members of x that are not members of y. We also postulate the existence of the
empty set (or null set) $\emptyset$, that is, a set that has no members at all. Then
$x \cap \emptyset = \emptyset$, $x \cup \emptyset = x$, $x - \emptyset = x$, $\emptyset - x = \emptyset$, and $x - x = \emptyset$.
Sets x and y are called disjoint if $x \cap y = \emptyset$.
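
For finite sets, the identities just listed can be checked mechanically. The following is a minimal sketch, assuming Python's built-in set type as a stand-in for finite sets; the particular sets x, y, z are arbitrary illustrations, and a check on one example is of course not a proof of the general laws.

```python
# Illustrative finite sets; any other finite sets would serve equally well.
x = {1, 2, 3}
y = {3, 4}
z = {2, 4, 5}
empty = set()

# Commutativity and associativity of union (|) and intersection (&).
assert x | y == y | x and (x | y) | z == x | (y | z)
assert x & y == y & x and (x & y) & z == x & (y & z)

# The two distributive laws.
assert x & (y | z) == (x & y) | (x & z)
assert x | (y & z) == (x | y) & (x | z)

# Relative complement (-) and the empty set.
assert x & empty == empty and x | empty == x
assert x - empty == x and empty - x == empty and x - x == empty

# x and y are disjoint exactly when their intersection is empty.
are_disjoint = (x & y == empty)
```

Here x and y have the element 3 in common, so they are not disjoint.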

Given any objects $b_1, \ldots, b_k$, the set that contains $b_1, \ldots, b_k$ as its only
members is denoted $\{b_1, \ldots, b_k\}$. In particular, {x, y} is a set having x and y as its
only members and, if $x \ne y$, is called the unordered pair of x and y. The set {x, x}
is identical with {x} and is called the unit set of x. Notice that {x, y} = {y, x}.
By $\langle b_1, \ldots, b_k \rangle$ we mean the ordered k-tuple of $b_1, \ldots, b_k$. The basic
property of ordered k-tuples is that $\langle b_1, \ldots, b_k \rangle = \langle c_1, \ldots, c_k \rangle$
if and only if $b_1 = c_1, b_2 = c_2, \ldots, b_k = c_k$. Thus,
$\langle b_1, b_2 \rangle = \langle b_2, b_1 \rangle$ if and only if $b_1 = b_2$. Ordered 2-tuples are called
ordered pairs. The ordered 1-tuple $\langle b \rangle$ is taken to be b itself. If X is a set and k
is a positive integer, we denote by $X^k$ the set of all ordered k-tuples
$\langle b_1, \ldots, b_k \rangle$ of elements $b_1, \ldots, b_k$ of X. In particular, $X^1$ is X itself.
If Y and Z are sets, then by $Y \times Z$ we denote the set of all ordered pairs
$\langle y, z \rangle$ such that $y \in Y$ and $z \in Z$. $Y \times Z$ is called the Cartesian product
of Y and Z.


* Which collections of objects form sets will not be specified here. Care will be exercised to
avoid using any ideas or procedures that may lead to the paradoxes; all the results can be
formalized in the axiomatic set theory of Chapter 4. The term “class” is sometimes used as a
synonym for “set,” but it will be avoided here because it has a different meaning in Chapter 4.
If a property P(x) does determine a set, that set is often denoted $\{x \mid P(x)\}$.

An n-place relation (or a relation with n arguments) on a set X is a subset
of $X^n$, that is, a set of ordered n-tuples of elements of X. For example, the
3-place relation of betweenness for points on a line is the set of all 3-tuples
$\langle x, y, z \rangle$ such that the point x lies between the points y and z. A 2-place relation
is called a binary relation; for example, the binary relation of fatherhood on
the set of human beings is the set of all ordered pairs $\langle x, y \rangle$ such that x and y
are human beings and x is the father of y. A 1-place relation on X is a subset
of X and is called a property on X.

Given a binary relation R on a set X, the domain of R is defined to be the set
of all y such that $\langle y, z \rangle \in R$ for some z; the range of R is the set of all z such that
$\langle y, z \rangle \in R$ for some y; and the field of R is the union of the domain and range
of R. The inverse relation $R^{-1}$ of R is the set of all ordered pairs $\langle y, z \rangle$ such that
$\langle z, y \rangle \in R$. For example, the domain of the relation < on the set $\omega$ of nonnegative
integers* is $\omega$, its range is $\omega - \{0\}$, and the inverse of < is >. Notation: Very
often xRy is written instead of $\langle x, y \rangle \in R$. Thus, in the example just given, we
usually write x < y instead of $\langle x, y \rangle \in {<}$.
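
These definitions translate directly into operations on finite sets of pairs. The sketch below is only an illustration, assuming that a binary relation is represented as a Python set of 2-tuples; the helper names domain, range_of, field, and inverse are our own. Because the example restricts < to the finite set {0, 1, 2, 3}, the largest element drops out of the domain, which does not happen on all of ω.

```python
def domain(R):
    """All y such that (y, z) is in R for some z."""
    return {y for (y, z) in R}

def range_of(R):
    """All z such that (y, z) is in R for some y."""
    return {z for (y, z) in R}

def field(R):
    """Union of the domain and the range."""
    return domain(R) | range_of(R)

def inverse(R):
    """All pairs (y, z) such that (z, y) is in R."""
    return {(z, y) for (y, z) in R}

# The relation < restricted to the finite set {0, 1, 2, 3}.
N = range(4)
less = {(a, b) for a in N for b in N if a < b}
greater = {(a, b) for a in N for b in N if a > b}
```

On this fragment, domain(less) is {0, 1, 2}, range_of(less) is {1, 2, 3}, and inverse(less) coincides with greater.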

A binary relation R is said to be reflexive if xRx for all x in the field of R; R 
is symmetric if xRy implies yRx; and R is transitive if xRy and yRz imply xRz. 
The following are examples: The relation $\le$ on the set of integers is reflexive
and transitive but not symmetric. The relation “having at least one parent 
in common” on the set of human beings is reflexive and symmetric, but not 
transitive. 
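
On a finite relation, each of the three properties can be tested by brute force. The following sketch assumes relations are sets of pairs; the tiny "family" used for the second example (people a, b, c with parents p and q) is invented purely for illustration.

```python
def field(R):
    """Every object that occurs in some pair of R."""
    return {a for pair in R for a in pair}

def is_reflexive(R):
    return all((x, x) in R for x in field(R))

def is_symmetric(R):
    return all((y, x) in R for (x, y) in R)

def is_transitive(R):
    return all((x, w) in R for (x, y) in R for (z, w) in R if y == z)

# "Less than or equal" on a finite fragment of the integers.
S = range(4)
leq = {(a, b) for a in S for b in S if a <= b}

# Hypothetical family: a and b share parent p; b and c share parent q.
parents = {"a": {"p"}, "b": {"p", "q"}, "c": {"q"}}
common_parent = {(u, v) for u in parents for v in parents
                 if parents[u] & parents[v]}
```

Here leq is reflexive and transitive but not symmetric, while common_parent is reflexive and symmetric but not transitive, since a is related to b and b to c, but a is not related to c.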

A binary relation that is reflexive, symmetric, and transitive is called an 
equivalence relation. Examples of equivalence relations are (1) the identity
relation $I_X$ on a set X, consisting of all pairs $\langle x, x \rangle$, where $x \in X$; (2) given a fixed
positive integer n, the relation $x \equiv y \pmod{n}$, which holds when x and y are
integers and $x - y$ is divisible by n; (3) the congruence relation on the set of
triangles in a plane; and (4) the similarity relation on the set of triangles in
a plane. Given an equivalence relation R whose field is X, and given any
$y \in X$, define [y] as the set of all z in X such that yRz. Then [y] is called
the R-equivalence class of y. Clearly, [u] = [v] if and only if uRv. Moreover, if
$[u] \ne [v]$, then $[u] \cap [v] = \emptyset$; that is, different R-equivalence classes have no
elements in common. Hence, the set X is completely partitioned into the
R-equivalence classes. In example (1) earlier, the equivalence classes are just
the unit sets {x}, where $x \in X$. In example (2), there are n equivalence classes,
the kth equivalence class $(k = 0, 1, \ldots, n - 1)$ being the set of all integers that
leave the remainder k upon division by n. 
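
Example (2) can be made concrete on a finite range of integers. The sketch below is an illustration only: the function name equivalence_classes, the choice n = 3, and the range from -6 to 6 are all assumptions made for the example.

```python
def equivalence_classes(X, related):
    """Collect the distinct classes [y] = {z in X : y related to z}."""
    classes = []
    for y in X:
        cls = frozenset(z for z in X if related(y, z))
        if cls not in classes:
            classes.append(cls)
    return classes

n = 3
X = range(-6, 7)

# x is congruent to y (mod n) when x - y is divisible by n.
classes = equivalence_classes(X, lambda x, y: (x - y) % n == 0)
```

There are exactly n = 3 classes; they are pairwise disjoint and together exhaust X, which is the partition property stated above.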


* $\omega$ will also be referred to as the set of natural numbers.

A function f is a binary relation such that $\langle x, y \rangle \in f$ and $\langle x, z \rangle \in f$ imply y = z.
Thus, for any element x of the domain of a function f, there is a unique y such
that $\langle x, y \rangle \in f$; this unique y is denoted f(x). If x is in the domain of f, then f(x)
is said to be defined. A function f with domain X and range Y is said to be a
function from X onto Y. If f is a function from X onto a subset of Z, then f is
said to be a function from X into Z. For example, if the domain of f is the set
of integers and f(x) = 2x for every integer x, then f is a function from the set of
integers onto the set of even integers, and f is a function from the set of integers
into the set of integers. A function whose domain consists of n-tuples is
said to be a function of n arguments. A total function of n arguments on a set X is
a function f whose domain is $X^n$. It is customary to write $f(x_1, \ldots, x_n)$ instead
of $f(\langle x_1, \ldots, x_n \rangle)$, and we refer to $f(x_1, \ldots, x_n)$ as the value of f for the arguments
$x_1, \ldots, x_n$. A partial function of n arguments on a set X is a function whose
domain is a subset of $X^n$. For example, ordinary division is a partial, but not
total, function of two arguments on the set of integers, since division by 0 is
not defined. If f is a function with domain X and range Y, then the restriction
$f_Z$ of f to a set Z is the function $f \cap (Z \times Y)$. Then $f_Z(u) = v$ if and only if $u \in Z$
and f(u) = v. The image of the set Z under the function f is the range of $f_Z$. The
inverse image of a set W under the function f is the set of all u in the domain
of f such that $f(u) \in W$. We say that f maps X onto (into) Y if X is a subset of the
domain of f and the image of X under f is (a subset of) Y. By an n-place operation
(or operation with n arguments) on a set X we mean a function from $X^n$
into X. For example, ordinary addition is a binary (i.e., 2-place) operation
on the set of natural numbers {0, 1, 2, ...}. But ordinary subtraction is not a
binary operation on the set of natural numbers.
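
Treating a function literally as a set of ordered pairs makes the notions of restriction, image, and inverse image easy to compute in the finite case. The sketch below assumes that representation; the helper names and the finite fragment of the doubling function are illustrative choices, not notation from the text.

```python
def is_function(R):
    """True if no first component is paired with two different values."""
    seen = {}
    for (x, y) in R:
        if seen.setdefault(x, y) != y:
            return False
    return True

def restriction(f, Z):
    """The pairs of f whose first component lies in Z."""
    return {(x, y) for (x, y) in f if x in Z}

def image(f, Z):
    """The range of the restriction of f to Z."""
    return {y for (x, y) in restriction(f, Z)}

def inverse_image(f, W):
    """All u in the domain of f with f(u) in W."""
    return {x for (x, y) in f if y in W}

# A finite fragment of the function f(x) = 2x on the integers.
double = {(x, 2 * x) for x in range(-3, 4)}

# Not a function: 1 is paired with both 2 and 3.
not_a_function = {(1, 2), (1, 3)}
```

For instance, image(double, {0, 1, 2}) is {0, 2, 4}, and inverse_image(double, {4, 5, -6}) is {2, -3}, since no integer doubles to 5.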

The composition $f \circ g$ (sometimes denoted fg) of functions f and g is the
function such that $(f \circ g)(x) = f(g(x))$; $(f \circ g)(x)$ is defined if and only if g(x)
is defined and f(g(x)) is defined. For example, if $g(x) = x^2$ and f(x) = x + 1 for
every integer x, then $(f \circ g)(x) = x^2 + 1$ and $(g \circ f)(x) = (x + 1)^2$. Also, if $h(x) = -x$
for every real number x and $f(x) = \sqrt{x}$ for every nonnegative real number x,
then $(f \circ h)(x)$ is defined only for $x \le 0$, and, for such x, $(f \circ h)(x) = \sqrt{-x}$. A function
f such that f(x) = f(y) implies x = y is called a 1-1 (one-one) function. For
example, the identity relation $I_X$ on a set X is a 1-1 function, since $I_X(y) = y$ for
every $y \in X$; the function g with domain $\omega$, such that g(x) = 2x for every $x \in \omega$,
is 1-1 (one-one); but the function h whose domain is the set of integers and
such that $h(x) = x^2$ for every integer x is not 1-1, since h(-1) = h(1). Notice that
a function f is 1-1 if and only if its inverse relation $f^{-1}$ is a function. If the
domain and range of a 1-1 function f are X and Y, then f is said to be a 1-1
correspondence between X and Y; then $f^{-1}$ is a 1-1 correspondence between
Y and X, and $(f^{-1} \circ f) = I_X$ and $(f \circ f^{-1}) = I_Y$. If f is a 1-1 correspondence
between X and Y and g is a 1-1 correspondence between Y and Z, then $g \circ f$
is a 1-1 correspondence between X and Z. Sets X and Y are said to be equinumerous
(written $X \cong Y$) if and only if there is a 1-1 correspondence between
X and Y. Clearly, $X \cong X$, $X \cong Y$ implies $Y \cong X$, and $X \cong Y$ and $Y \cong Z$ implies
$X \cong Z$. It is somewhat harder to show that, if $X \cong Y_1 \subseteq Y$ and $Y \cong X_1 \subseteq X$, then
$X \cong Y$ (see Bernstein's theorem in Chapter 4). If $X \cong Y$, one says that X and Y
have the same cardinal number, and if X is equinumerous with a subset of Y but
Y is not equinumerous with a subset of X, one says that the cardinal number
of X is smaller than the cardinal number of Y.*
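
The composition examples and the one-one test can be tried out directly. This is a small sketch under the assumption that functions are modeled as Python callables and that one-one-ness is only checked on a finite sample of the domain; the names compose and is_one_one are ours.

```python
import math

def compose(f, g):
    """Return the composition f o g, which sends x to f(g(x))."""
    return lambda x: f(g(x))

def is_one_one(f, sample):
    """Check that f takes distinct values on the distinct points of sample."""
    values = [f(x) for x in sample]
    return len(values) == len(set(values))

def g(x):
    return x ** 2

def f(x):
    return x + 1

def h(x):
    return -x

f_after_g = compose(f, g)          # x -> x^2 + 1
g_after_f = compose(g, f)          # x -> (x + 1)^2
sqrt_after_h = compose(math.sqrt, h)  # defined only for x <= 0
```

With these definitions, f_after_g(3) is 10 while g_after_f(3) is 16, so the order of composition matters. Calling sqrt_after_h on a positive number raises a ValueError, reflecting that the composition is undefined there.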

A set X is denumerable if it is equinumerous with the set of positive integers.
A denumerable set is said to have cardinal number $\aleph_0$, and any set equinumerous
with the set of all subsets of a denumerable set is said to have the
cardinal number $2^{\aleph_0}$ (or to have the power of the continuum). A set X is finite
if it is empty or if it is equinumerous with the set $\{1, 2, \ldots, n\}$ of all positive
integers that are less than or equal to some positive integer n. A set that is
not finite is said to be infinite. A set is countable if it is either finite or denumerable.
Clearly, any subset of a denumerable set is countable. A denumerable
sequence is a function s whose domain is the set of positive integers; one usually
writes $s_n$ instead of s(n). A finite sequence is a function whose domain is
the empty set or $\{1, 2, \ldots, n\}$ for some positive integer n.

Let $P(x, y_1, \ldots, y_k)$ be some relation on the set of nonnegative integers.
In particular, P may involve only the variable x and thus be a property. If
$P(0, y_1, \ldots, y_k)$ holds, and, if, for every n, $P(n, y_1, \ldots, y_k)$ implies
$P(n + 1, y_1, \ldots, y_k)$, then $P(x, y_1, \ldots, y_k)$ is true for all nonnegative integers x
(principle of mathematical induction). In applying this principle, one usually
proves that, for every n, $P(n, y_1, \ldots, y_k)$ implies $P(n + 1, y_1, \ldots, y_k)$ by assuming
$P(n, y_1, \ldots, y_k)$ and then deducing $P(n + 1, y_1, \ldots, y_k)$; in the course of this
deduction, $P(n, y_1, \ldots, y_k)$ is called the inductive hypothesis. If the relation P
actually involves variables $y_1, \ldots, y_k$ other than x, then the proof is said to
proceed by induction on x. A similar induction principle holds for the set of
integers greater than some fixed integer j. An example is as follows: to prove by
mathematical induction that the sum of the first n odd integers
$1 + 3 + 5 + \cdots + (2n - 1)$ is $n^2$, first show that $1 = 1^2$ (i.e., P(1)), and then, that if
$1 + 3 + 5 + \cdots + (2n - 1) = n^2$, then
$1 + 3 + 5 + \cdots + (2n - 1) + (2n + 1) = (n + 1)^2$ (i.e., if P(n), then P(n + 1)). From the
principle of mathematical induction, one can prove the principle of complete
induction: If for every nonnegative integer x the assumption that $P(u, y_1, \ldots, y_k)$
is true for all u < x implies that $P(x, y_1, \ldots, y_k)$ holds, then, for all nonnegative
integers x, $P(x, y_1, \ldots, y_k)$ is true. (Exercise: Show by complete induction that
every integer greater than 1 is divisible by a prime number.)
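
The odd-integer example can be spot-checked numerically. The sketch below is not a proof: it merely confirms the base case and tests, for a finite range of n chosen arbitrarily, both the claimed formula and the algebraic identity on which the inductive step rests. The function name sum_first_odds is an illustrative choice.

```python
def sum_first_odds(n):
    """Return 1 + 3 + 5 + ... + (2n - 1), the sum of the first n odd integers."""
    return sum(2 * i - 1 for i in range(1, n + 1))

# Base case P(1): the sum of the first odd integer is 1 squared.
assert sum_first_odds(1) == 1 ** 2

for n in range(1, 200):
    # Identity behind the inductive step: n^2 + (2n + 1) = (n + 1)^2.
    assert n ** 2 + (2 * n + 1) == (n + 1) ** 2
    # The claimed formula P(n), checked directly.
    assert sum_first_odds(n) == n ** 2
```

The inductive step in the text corresponds to the first assertion inside the loop: assuming the sum up to 2n - 1 is n squared, adding the next odd integer 2n + 1 yields (n + 1) squared.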

A partial order is a binary relation R such that R is transitive and, for every
x in the field of R, xRx is false. If R is a partial order, then the relation $R'$ that
is the union of R and the set of all ordered pairs $\langle x, x \rangle$, where x is in the field
of R, we shall call a reflexive partial order; in the literature, “partial order” is
used for either partial order or reflexive partial order. Notice that (xRy and
yRx) is impossible if R is a partial order, whereas (xRy and yRx) implies x = y
if R is a reflexive partial order. A (reflexive) total order is a (reflexive) partial
order such that, for any x and y in the field of R, either x = y or xRy or yRx. For
example, (1) the relation < on the set of integers is a total order, whereas $\le$ is
a reflexive total order; (2) the relation $\subset$ on the set of all subsets of the set of
positive integers is a partial order but not a total order, whereas the relation $\subseteq$
is a reflexive partial order but not a reflexive total order. If B is a subset of the
field of a binary relation R, then an element y of B is called an R-least element
of B if yRz for every element z of B different from y. A well-order (or a
well-ordering relation) is a total order R such that every nonempty subset of the
field of R has an R-least element. For example, (1) the relation < on the set of
nonnegative integers is a well-order, (2) the relation < on the set of nonnegative
rational numbers is a total order but not a well-order, and (3) the relation
< on the set of integers is a total order but not a well-order. Associated with
every well-order R having field X, there is a complete induction principle: if P is
a property such that, for any u in X, whenever all z in X such that zRu have
the property P, then u has the property P, then it follows that all members
of X have the property P. If the set X is infinite, a proof using this principle
is called a proof by transfinite induction. One says that a set X can be
well-ordered if there exists a well-order whose field is X. An assumption that is
useful in modern mathematics but about the validity of which there has been
considerable controversy is the well-ordering principle: every set can be
well-ordered. The well-ordering principle is equivalent (given the usual axioms of
set theory) to the axiom of choice: for any set X of nonempty pairwise disjoint
sets, there is a set Y (called a choice set) that contains exactly one element in
common with each set in X.


* One can attempt to define the cardinal number of a set X as the collection [X] of all sets
equinumerous with X. However, in certain axiomatic set theories, [X] does not exist, whereas in
others [X] exists but is not a set.
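
For finite relations, the order-theoretic definitions above can be checked exhaustively. The following sketch assumes relations are sets of pairs and uses the subsets of {1, 2, 3} in place of the subsets of the positive integers; all function names are illustrative.

```python
from itertools import combinations

def field(R):
    return {a for pair in R for a in pair}

def is_partial_order(R):
    """Transitive, and xRx is false for every x in the field."""
    irreflexive = all((x, x) not in R for x in field(R))
    transitive = all((x, w) in R for (x, y) in R for (z, w) in R if y == z)
    return irreflexive and transitive

def is_total_order(R):
    """A partial order in which any two field elements are comparable."""
    F = field(R)
    return is_partial_order(R) and all(
        x == y or (x, y) in R or (y, x) in R for x in F for y in F)

def least_elements(R, B):
    """All y in B with yRz for every z in B different from y."""
    return [y for y in B if all((y, z) in R for z in B if z != y)]

# < on a finite fragment of the nonnegative integers.
S = range(5)
less = {(a, b) for a in S for b in S if a < b}

# Proper inclusion on the subsets of {1, 2, 3}.
subsets = [frozenset(c) for r in range(4) for c in combinations({1, 2, 3}, r)]
proper_subset = {(a, b) for a in subsets for b in subsets if a < b}
```

Here less is a total order, and least_elements(less, {2, 3, 4}) is [2]. By contrast, proper_subset is a partial order but not a total order: the sets {1} and {2} are incomparable, so the two-element collection containing them has no least element.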

Let B be a nonempty set, f a function from B into B, and g a function from
$B^2$ into B. Write $x'$ for f(x) and $x \cap y$ for g(x, y). Then $\langle B, f, g \rangle$ is called a Boolean
algebra if B contains at least two elements and the following conditions are
satisfied:


1. $x \cap y = y \cap x$ for all x and y in B.
2. $(x \cap y) \cap z = x \cap (y \cap z)$ for all x, y, z in B.
3. $x \cap y' = z \cap z'$ if and only if $x \cap y = x$ for all x, y, z in B.


Let $x \cup y$ stand for $(x' \cap y')'$, and write $x \le y$ for $x \cap y = x$. It is easily proved
that $z \cap z' = w \cap w'$ for any w and z in B; we denote the value of $z \cap z'$ by 0.
Let 1 stand for $0'$. Then $z \cup z' = 1$ for all z in B. Note also that $\le$ is a reflexive
partial order on B, and $\langle B, f, \cup \rangle$ is a Boolean algebra. (The symbols $\cap$, $\cup$, 0, 1
should not be confused with the corresponding symbols used in set theory
and arithmetic.) An ideal J in $\langle B, f, g \rangle$ is a nonempty subset of B such that (1) if
$x \in J$ and $y \in J$, then $x \cup y \in J$, and (2) if $x \in J$ and $y \in B$, then $x \cap y \in J$. Clearly,
{0} and B are ideals. An ideal different from B is called a proper ideal. A maximal
ideal is a proper ideal that is included in no other proper ideal. It can be
shown that a proper ideal J is maximal if and only if, for any u in B, $u \in J$ or
$u' \in J$. From the axiom of choice it can be proved that every Boolean algebra
contains a maximal ideal, or, equivalently, that every proper ideal is included
in some maximal ideal. For example, let B be the set of all subsets of a set X;
for $Y \in B$, let $Y' = X - Y$, and for Y and Z in B, let $Y \cap Z$ be the ordinary
set-theoretic intersection of Y and Z. Then $\langle B, {}', \cap \rangle$ is a Boolean algebra. The 0 of B
is the empty set $\emptyset$, and 1 is X. For each element u in X, the set $J_u$ of all subsets
of X that do not contain u is a maximal ideal. For a detailed study of Boolean
algebras, see Sikorski (1960), Halmos (1963), and Mendelson (1970).
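
The power-set example can be verified by brute force when X is small. The sketch below takes X = {1, 2, 3} and u = 1 as arbitrary illustrative choices and checks the three conditions of the definition, the derived operation for union, and the ideal properties of the family of subsets not containing u. The names comp, meet, join, and is_ideal are ours.

```python
from itertools import combinations

X = frozenset({1, 2, 3})
# B is the set of all subsets of X.
B = [frozenset(c) for r in range(len(X) + 1) for c in combinations(X, r)]

def comp(y):
    """Y' = X - Y."""
    return X - y

def meet(y, z):
    """Ordinary set-theoretic intersection."""
    return y & z

def join(y, z):
    """The derived operation (y' meet z')'."""
    return comp(meet(comp(y), comp(z)))

axiom1 = all(meet(x, y) == meet(y, x) for x in B for y in B)
axiom2 = all(meet(meet(x, y), z) == meet(x, meet(y, z))
             for x in B for y in B for z in B)
axiom3 = all((meet(x, comp(y)) == meet(z, comp(z))) == (meet(x, y) == x)
             for x in B for y in B for z in B)

# The derived join agrees with ordinary union.
join_is_union = all(join(y, z) == y | z for y in B for z in B)

def is_ideal(J):
    return (len(J) > 0
            and all(join(x, y) in J for x in J for y in J)
            and all(meet(x, y) in J for x in J for y in B))

# J_u for u = 1: all subsets of X that do not contain 1.
J = [y for y in B if 1 not in y]
J_is_maximal = X not in J and all(y in J or comp(y) in J for y in B)
```

All three axioms hold for this B, the family J is a proper ideal, and for every subset either it or its complement lies in J, which is the stated criterion for maximality.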


The Propositional Calculus


1.1 Propositional Connectives: Truth Tables 


Sentences may be combined in various ways to form more complicated sen- 
tences. We shall consider only truth-functional combinations, in which the 
truth or falsity of the new sentence is determined by the truth or falsity of its 
component sentences. 

Negation is one of the simplest operations on sentences. Although a sen- 
tence in a natural language may be negated in many ways, we shall adopt a 
uniform procedure: placing a sign for negation, the symbol $\neg$, in front of the
entire sentence. Thus, if A is a sentence, then $\neg A$ denotes the negation of A.

The truth-functional character of negation is made apparent in the follow- 
ing truth table: 


A    $\neg A$
T    F
F    T

When A is true, $\neg A$ is false; when A is false, $\neg A$ is true. We use T and F to
denote the truth values true and false.

Another common truth-functional operation is the conjunction: “and.” The
conjunction of sentences A and B will be designated by A ∧ B and has the
following truth table:


A    B    A ∧ B
T    T    T
F    T    F
T    F    F
F    F    F


A ∧ B is true when and only when both A and B are true. A and B are called
the conjuncts of A ∧ B. Note that there are four rows in the table,
corresponding to the number of possible assignments of truth values to A and B.

In natural languages, there are two distinct uses of “or”: the inclusive and
the exclusive. According to the inclusive usage, “A or B” means “A or B or
both,” whereas according to the exclusive usage, the meaning is “A or B, but
not both.” We shall introduce a special sign, ∨, for the inclusive connective.
Its truth table is as follows:


A    B    A ∨ B
T    T    T
F    T    T
T    F    T
F    F    F


Thus, A ∨ B is false when and only when both A and B are false. “A ∨ B” is
called a disjunction, with the disjuncts A and B.

Another important truth-functional operation is the conditional: “if A, then
B.” Ordinary usage is unclear here. Surely, “if A, then B” is false when the
antecedent A is true and the consequent B is false. However, in other cases,
there is no well-defined truth value. For example, the following sentences
would be considered neither true nor false:


1. If 1 + 1 = 2, then Paris is the capital of France.
2. If 1 + 1 ≠ 2, then Paris is the capital of France.
3. If 1 + 1 ≠ 2, then Rome is the capital of France.


Their meaning is unclear, since we are accustomed to the assertion of some
sort of relationship (usually causal) between the antecedent and the
consequent. We shall make the convention that “if A, then B” is false when and
only when A is true and B is false. Thus, sentences 1–3 are assumed to be
true. Let us denote “if A, then B” by “A ⇒ B.” An expression “A ⇒ B” is called
a conditional. Then ⇒ has the following truth table:


A    B    A ⇒ B
T    T    T
F    T    T
T    F    F
F    F    T


This sharpening of the meaning of “if A, then B” involves no conflict with
ordinary usage, but rather only an extension of that usage.*


* There is a common non-truth-functional interpretation of “if A, then B” connected with 
causal laws. The sentence “if this piece of iron is placed in water at time t, then the iron will 
dissolve” is regarded as false even in the case that the piece of iron is not placed in water at 
time t—that is, even when the antecedent is false. Another non-truth-functional usage occurs 
in so-called counterfactual conditionals, such as “if Sir Walter Scott had not written any nov- 
els, then there would have been no War Between the States.” (This was Mark Twain’s conten- 
tion in Life on the Mississippi: “Sir Walter had so large a hand in making Southern character, as 
it existed before the war, that he is in great measure responsible for the war.”) This sentence 
might be asserted to be false even though the antecedent is admittedly false. However, causal 
laws and counterfactual conditions seem not to be needed in mathematics and logic. For a 
clear treatment of conditionals and other connectives, see Quine (1951). (The quotation from 
Life on the Mississippi was brought to my attention by Professor J.C. Owings, Jr.)

A justification of the truth table for ⇒ is the fact that we wish “if A and
B, then B” to be true in all cases. Thus, the case in which A and B are true
justifies the first line of our truth table for ⇒, since (A and B) and B are both
true. If A is false and B true, then (A and B) is false while B is true. This
corresponds to the second line of the truth table. Finally, if A is false and B is
false, (A and B) is false and B is false. This gives the fourth line of the table.
Still more support for our definition comes from the meaning of statements
such as “for every x, if x is an odd positive integer, then x² is an odd positive
integer.” This asserts that, for every x, the statement “if x is an odd positive
integer, then x² is an odd positive integer” is true. Now we certainly do not
want to consider cases in which x is not an odd positive integer as
counterexamples to our general assertion. This supports the second and fourth
lines of our truth table. In addition, any case in which x is an odd positive
integer and x² is an odd positive integer confirms our general assertion.
This corresponds to the first line of the table.

Let us denote “A if and only if B” by “A ⇔ B.” Such an expression is called
a biconditional. Clearly, A ⇔ B is true when and only when A and B have the
same truth value. Its truth table, therefore, is:


A    B    A ⇔ B
T    T    T
F    T    F
T    F    F
F    F    T
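
The four binary truth tables above can be reproduced mechanically. The following is a minimal sketch, assuming that T and F are modeled by Python's True and False; the function names (neg, conj, disj, cond, bicond) are illustrative and are not notation from the text.

```python
# Truth functions for the five connectives (names are illustrative only).
def neg(a): return not a
def conj(a, b): return a and b
def disj(a, b): return a or b
def cond(a, b): return (not a) or b   # false only when a is true and b is false
def bicond(a, b): return a == b       # true exactly when the two values agree

def show(v):
    return "T" if v else "F"

# Rows in the order used in the text: TT, FT, TF, FF.
ROWS = [(True, True), (False, True), (True, False), (False, False)]

def table(op):
    """Return the rows of the truth table of a binary connective."""
    return [(show(a), show(b), show(op(a, b))) for a, b in ROWS]

for symbol, op in [("∧", conj), ("∨", disj), ("⇒", cond), ("⇔", bicond)]:
    print(f"A B A {symbol} B")
    for row in table(op):
        print(" ".join(row))
```

Writing the conditional as (not a) or b is one way to encode the convention that “if A, then B” is false only when A is true and B is false.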


The symbols ¬, ∧, ∨, ⇒, and ⇔ will be called propositional connectives.* Any
sentence built up by application of these connectives has a truth value that
depends on the truth values of the constituent sentences. In order to make
this dependence apparent, let us apply the name statement form to an
expression built up from the statement letters A, B, C, and so on by appropriate
applications of the propositional connectives.


1. All statement letters (capital italic letters) and such letters with
numerical subscripts† are statement forms.

2. If ℬ and 𝒞 are statement forms, then so are (¬ℬ), (ℬ ∧ 𝒞), (ℬ ∨ 𝒞),
(ℬ ⇒ 𝒞), and (ℬ ⇔ 𝒞).


* We have been avoiding and shall in the future avoid the use of quotation marks to form
names whenever this is not likely to cause confusion. The given sentence should have
quotation marks around each of the connectives. See Quine (1951, pp. 23-27).

† For example, A₁, A₂, A₁₇, B₃₁, C₂, ....

3. Only those expressions are statement forms that are determined
to be so by means of conditions 1 and 2.* Some examples of
statement forms are B, (¬C₂), (D₃ ∧ (¬B)), (((¬B₁) ∨ B₂) ⇒ (A₁ ∧ C₂)), and
(((¬A) ⇔ A) ⇔ (C ⇒ (B ∨ C))).


For every assignment of truth values T or F to the statement letters that occur
in a statement form, there corresponds, by virtue of the truth tables for the
propositional connectives, a truth value for the statement form. Thus, each
statement form determines a truth function, which can be graphically
represented by a truth table for the statement form. For example, the statement
form (((¬A) ∨ B) ⇒ C) has the following truth table:


A    B    C    (¬A)    ((¬A) ∨ B)    (((¬A) ∨ B) ⇒ C)
T    T    T    F       T             T
F    T    T    T       T             T
T    F    T    F       F             T
F    F    T    T       T             T
T    T    F    F       T             F
F    T    F    T       T             F
T    F    F    F       F             T
F    F    F    T       T             F


Each row represents an assignment of truth values to the statement letters
A, B, and C and the corresponding truth values assumed by the statement
forms that appear in the construction of (((¬A) ∨ B) ⇒ C).

The truth table for ((A ⇔ B) ⇒ ((¬A) ∧ B)) is as follows:


A    B    (A ⇔ B)    (¬A)    ((¬A) ∧ B)    ((A ⇔ B) ⇒ ((¬A) ∧ B))
T    T    T          F       F             F
F    T    F          T       T             T
T    F    F          F       F             T
F    F    T          T       F             F


If there are n distinct letters in a statement form, then there are 2ⁿ possible
assignments of truth values to the statement letters and, hence, 2ⁿ rows in
the truth table.
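
As a rough illustration of the 2ⁿ-row count, a truth table can be generated by enumerating all assignments. The sketch below assumes that a statement form is represented as a Python function of a dictionary of truth values; that representation, and the name truth_table, are choices made for this example only. The rows come out in the order produced by itertools.product, which differs from the row order printed above.

```python
from itertools import product

def truth_table(letters, form):
    """Return one row per assignment: the letter values followed by the form's value."""
    rows = []
    # product(...) yields all 2**n assignments for n statement letters.
    for values in product([True, False], repeat=len(letters)):
        env = dict(zip(letters, values))
        rows.append(values + (form(env),))
    return rows

# (((¬A) ∨ B) ⇒ C), with X ⇒ Y encoded as (not X) or Y.
form = lambda e: (not ((not e["A"]) or e["B"])) or e["C"]

rows = truth_table(["A", "B", "C"], form)
for row in rows:
    print(" ".join("T" if v else "F" for v in row))
```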


* This can be rephrased as follows: 𝒞 is a statement form if and only if there is a finite sequence
ℬ₁, ..., ℬₙ (n ≥ 1) such that ℬₙ = 𝒞 and, if 1 ≤ i ≤ n, ℬᵢ is either a statement letter or a negation,
conjunction, disjunction, conditional, or biconditional constructed from previous expressions in
the sequence. Notice that we use script letters 𝒜, ℬ, 𝒞, ... to stand for arbitrary expressions,
whereas italic letters are used as statement letters.




A truth table can be abbreviated by writing only the full statement form,
putting the truth values of the statement letters underneath all occurrences
of these letters, and writing, step by step, the truth values of each component
statement form under the principal connective of the form.* As an example,
for ((A ⇔ B) ⇒ ((¬A) ∧ B)), we obtain


((A ⇔ B) ⇒ ((¬A) ∧ B))
  T T T  F   FT  F T
  F F T  T   TF  T T
  T F F  T   FT  F F
  F T F  F   TF  F F


Exercises 


1.1 Let ⊕ designate the exclusive use of “or.” Thus, A ⊕ B stands for “A or
B but not both.” Write the truth table for ⊕.


1.2 Construct truth tables for the statement forms ((A ⇒ B) ∨ (¬A)) and
((A ⇒ (B ⇒ C)) ⇒ ((A ⇒ B) ⇒ (A ⇒ C))).

1.3 Write abbreviated truth tables for ((A ⇒ B) ∧ A) and ((A ∨ (¬C)) ⇔ B).


1.4 Write the following sentences as statement forms, using statement
letters to stand for the atomic sentences, that is, those sentences that are
not built up out of other sentences.


a. If Mr Jones is happy, Mrs Jones is not happy, and if Mr Jones is not
happy, Mrs Jones is not happy.


b. Either Sam will come to the party and Max will not, or Sam will not
come to the party and Max will enjoy himself.


c. A sufficient condition for x to be odd is that x is prime.


d. A necessary condition for a sequence s to converge is that s be
bounded.


e. A necessary and sufficient condition for the sheikh to be happy is
that he has wine, women, and song.


f. Fiorello goes to the movies only if a comedy is playing.


g. The bribe will be paid if and only if the goods are delivered.


h. If x is positive, x² is positive.


i. Karpov will win the chess tournament unless Kasparov wins
today.


* The principal connective of a statement form is the one that is applied last in constructing the 
form.

1.2 Tautologies


A truth function of n arguments is defined to be a function of n arguments, the
arguments and values of which are the truth values T or F. As we have seen,
any statement form containing n distinct statement letters determines a
corresponding truth function of n arguments.*

A statement form that is always true, no matter what the truth values of its
statement letters may be, is called a tautology. A statement form is a
tautology if and only if its corresponding truth function takes only the value T,
or equivalently, if, in its truth table, the column under the statement form
contains only Ts. An example of a tautology is (A ∨ (¬A)), the so-called law
of the excluded middle. Other simple examples are (¬(A ∧ (¬A))), (A ⇔ (¬(¬A))),
((A ∧ B) ⇒ A), and (A ⇒ (A ∨ B)).

ℬ is said to logically imply 𝒞 (or, synonymously, 𝒞 is a logical consequence of ℬ)
if and only if every truth assignment to the statement letters of ℬ and 𝒞 that
makes ℬ true also makes 𝒞 true. For example, (A ∧ B) logically implies A, A
logically implies (A ∨ B), and (A ∧ (A ⇒ B)) logically implies B.

ℬ and 𝒞 are said to be logically equivalent if and only if ℬ and 𝒞 receive the
same truth value under every assignment of truth values to the statement
letters of ℬ and 𝒞. For example, A and (¬(¬A)) are logically equivalent, as are
(A ∧ B) and (B ∧ A).


* To be precise, enumerate all statement letters as follows: A, B, ..., Z; A₁, B₁, ..., Z₁; A₂, .... If a
statement form contains the i_1th, ..., i_nth statement letters in this enumeration, where i_1 < ⋯ < i_n,
then the corresponding truth function is to have x_{i_1}, ..., x_{i_n}, in that order, as its arguments,
where x_{i_j} corresponds to the i_jth statement letter. For example, (A ⇒ B) generates the truth
function:


x₁    x₂    f(x₁, x₂)
T     T     T
F     T     T
T     F     F
F     F     T


whereas (B ⇒ A) generates the truth function:


x₁    x₂    g(x₁, x₂)
T     T     T
F     T     F
T     F     T
F     F     T

Proposition 1.1


a. ℬ logically implies 𝒞 if and only if (ℬ ⇒ 𝒞) is a tautology.
b. ℬ and 𝒞 are logically equivalent if and only if (ℬ ⇔ 𝒞) is a tautology.


Proof


a. (i) Assume ℬ logically implies 𝒞. Hence, every truth assignment
that makes ℬ true also makes 𝒞 true. Thus, no truth assignment
makes ℬ true and 𝒞 false. Therefore, no truth assignment makes
(ℬ ⇒ 𝒞) false, that is, every truth assignment makes (ℬ ⇒ 𝒞) true.
In other words, (ℬ ⇒ 𝒞) is a tautology. (ii) Assume (ℬ ⇒ 𝒞) is a
tautology. Then, for every truth assignment, (ℬ ⇒ 𝒞) is true, and,
therefore, it is not the case that ℬ is true and 𝒞 false. Hence, every
truth assignment that makes ℬ true makes 𝒞 true, that is, ℬ
logically implies 𝒞.

b. (ℬ ⇔ 𝒞) is a tautology if and only if every truth assignment makes (ℬ ⇔ 𝒞)
true, which is equivalent to saying that every truth assignment gives
ℬ and 𝒞 the same truth value, that is, ℬ and 𝒞 are logically equivalent.


By means of a truth table, we have an effective procedure for determining 
whether a statement form is a tautology. Hence, by Proposition 1.1, we have 
effective procedures for determining whether a given statement form logi- 
cally implies another given statement form and whether two given statement 
forms are logically equivalent. 
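
The effective procedures mentioned above can be sketched in a few lines. This is only an illustration, assuming statement forms are modeled as Python functions of a dictionary of truth values; the helper names (assignments, is_tautology, logically_implies, logically_equivalent) are hypothetical. The last two functions rely directly on Proposition 1.1.

```python
from itertools import product

def assignments(letters):
    """Yield every assignment of truth values to the given statement letters."""
    for values in product([True, False], repeat=len(letters)):
        yield dict(zip(letters, values))

def is_tautology(letters, form):
    # A tautology takes the value T under every assignment.
    return all(form(e) for e in assignments(letters))

def logically_implies(letters, b, c):
    # Proposition 1.1(a): b logically implies c iff (b ⇒ c) is a tautology.
    return is_tautology(letters, lambda e: (not b(e)) or c(e))

def logically_equivalent(letters, b, c):
    # Proposition 1.1(b): b and c are equivalent iff (b ⇔ c) is a tautology.
    return is_tautology(letters, lambda e: b(e) == c(e))

excluded_middle = lambda e: e["A"] or (not e["A"])
print(is_tautology(["A"], excluded_middle))
print(logically_implies(["A", "B"], lambda e: e["A"] and e["B"], lambda e: e["A"]))
print(logically_equivalent(["A"], lambda e: e["A"], lambda e: not (not e["A"])))
```

The cost grows as 2ⁿ in the number of statement letters, so this is practical only for small forms.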

To see whether a statement form is a tautology, there is another method 
that is often shorter than the construction of a truth table. 


Examples
1. Determine whether ((A ⇔ ((¬B) ∨ C)) ⇒ ((¬A) ⇒ B)) is a tautology.

Assume that the statement form sometimes is F (line 1). Then
(A ⇔ ((¬B) ∨ C)) is T and ((¬A) ⇒ B) is F (line 2). Since ((¬A) ⇒ B) is F, (¬A)
is T and B is F (line 3). Since (¬A) is T, A is F (line 4). Since A is F and
(A ⇔ ((¬B) ∨ C)) is T, ((¬B) ∨ C) is F (line 5). Since ((¬B) ∨ C) is F, (¬B)
and C are F (line 6). Since (¬B) is F, B is T (line 7). But B is both T and F
(lines 7 and 3). Hence, it is impossible for the form to be false.

Line 1: ((A ⇔ ((¬B) ∨ C)) ⇒ ((¬A) ⇒ B)) is F
Line 2: (A ⇔ ((¬B) ∨ C)) is T; ((¬A) ⇒ B) is F
Line 3: (¬A) is T; B is F
Line 4: A is F
Line 5: ((¬B) ∨ C) is F
Line 6: (¬B) is F; C is F
Line 7: B is T

2. Determine whether ((A ⇒ (B ∨ C)) ∨ (A ⇒ B)) is a tautology.

Assume that the form is F (line 1). Then (A ⇒ (B ∨ C)) and
(A ⇒ B) are F (line 2). Since (A ⇒ B) is F, A is T and B is F
(line 3). Since (A ⇒ (B ∨ C)) is F, A is T and (B ∨ C) is F (line 4).
Since (B ∨ C) is F, B and C are F (line 5). Thus, when A is T, B
is F, and C is F, the form is F. Therefore, it is not a tautology.

Line 1: ((A ⇒ (B ∨ C)) ∨ (A ⇒ B)) is F
Line 2: (A ⇒ (B ∨ C)) is F; (A ⇒ B) is F
Line 3: A is T; B is F
Line 4: A is T; (B ∨ C) is F
Line 5: B is F; C is F
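
The conclusions of the two examples can be cross-checked by machine. Note that the sketch below does not follow the backward reasoning used above; it is a plain exhaustive search over all assignments, under the assumption that forms are modeled as Python functions. The names falsifying_assignment, imp, ex1, and ex2 are illustrative.

```python
from itertools import product

def falsifying_assignment(letters, form):
    """Return an assignment that makes the form false, or None if it is a tautology."""
    for values in product([True, False], repeat=len(letters)):
        env = dict(zip(letters, values))
        if not form(env):
            return env
    return None

def imp(x, y):
    # The conditional: false only when x is true and y is false.
    return (not x) or y

# Example 1: ((A ⇔ ((¬B) ∨ C)) ⇒ ((¬A) ⇒ B))
ex1 = lambda e: imp(e["A"] == ((not e["B"]) or e["C"]), imp(not e["A"], e["B"]))
# Example 2: ((A ⇒ (B ∨ C)) ∨ (A ⇒ B))
ex2 = lambda e: imp(e["A"], e["B"] or e["C"]) or imp(e["A"], e["B"])

print(falsifying_assignment(["A", "B", "C"], ex1))
print(falsifying_assignment(["A", "B", "C"], ex2))
```

For Example 2 the search returns the same assignment found by hand: A is T, B is F, and C is F.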


Exercises


1.5 Determine whether the following are tautologies.

a. (((A ⇒ B) ⇒ B) ⇒ B)
b. (((A ⇒ B) ⇒ B) ⇒ A)
c. (((A ⇒ B) ⇒ A) ⇒ A)
d. (((B ⇒ C) ⇒ (A ⇒ B)) ⇒ (A ⇒ B))
e. ((A ∨ (¬(B ∧ C))) ⇒ ((A ⇔ C) ∨ B))
f. (A ⇒ (B ⇒ (B ⇒ A)))
g. ((A ∧ B) ⇒ (A ∨ C))
h. ((A ⇔ B) ⇔ (A ⇔ (B ⇔ A)))
i. ((A ⇒ B) ∨ (B ⇒ A))
j. ((¬(A ⇒ B)) ⇒ A)


1.6 Determine whether the following pairs are logically equivalent.

a. ((A ⇒ B) ⇒ A) and A
b. (A ⇔ B) and ((A ⇒ B) ∧ (B ⇒ A))
c. ((¬A) ∨ B) and ((¬B) ∨ A)
d. (¬(A ⇔ B)) and (A ⇔ (¬B))
e. (A ∨ (B ⇔ C)) and ((A ∨ B) ⇔ (A ∨ C))
f. (A ⇒ (B ⇔ C)) and ((A ⇒ B) ⇔ (A ⇒ C))
g. (A ∧ (B ⇔ C)) and ((A ∧ B) ⇔ (A ∧ C))


1.7 Prove:

a. (A ⇒ B) is logically equivalent to ((¬A) ∨ B).
b. (A ⇒ B) is logically equivalent to (¬(A ∧ (¬B))).


1.8 Prove that ℬ is logically equivalent to 𝒞 if and only if ℬ logically implies
𝒞 and 𝒞 logically implies ℬ.


1.9 Show that ℬ and 𝒞 are logically equivalent if and only if, in their truth
tables, the columns under ℬ and 𝒞 are the same.


1.10 Prove that ℬ and 𝒞 are logically equivalent if and only if (¬ℬ) and (¬𝒞)
are logically equivalent.


1.11 Which of the following statement forms are logically implied by (A ∧ B)?

a. A
b. B
c. (A ∨ B)
d. ((¬A) ∨ B)
e. ((¬B) ⇒ A)
f. (A ⇔ B)
g. (A ⇒ B)
h. ((¬B) ⇒ (¬A))
i. (A ∧ (¬B))


1.12 Repeat Exercise 1.11 with (A ∧ B) replaced by (A ⇒ B) and by (¬(A ⇒ B)),
respectively.


1.13 Repeat Exercise 1.11 with (A ∧ B) replaced by (A ∨ B).


1.14 Repeat Exercise 1.11 with (A ∧ B) replaced by (A ⇔ B) and by (¬(A ⇔ B)),
respectively.


A statement form that is false for all possible truth values of its statement
letters is said to be contradictory. Its truth table has only Fs in the column
under the statement form. One example is (A ⇔ (¬A)):


A    (¬A)    (A ⇔ (¬A))
T    F       F
F    T       F

Another is (A ∧ (¬A)).

Notice that a statement form ℬ is a tautology if and only if (¬ℬ) is
contradictory, and vice versa.

A sentence (in some natural language like English or in a formal theory)*
that arises from a tautology by the substitution of sentences for all the
statement letters, with occurrences of the same statement letter being replaced by
the same sentence, is said to be logically true (according to the propositional
calculus). Such a sentence may be said to be true by virtue of its
truth-functional structure alone. An example is the English sentence, “If it is raining or
it is snowing, and it is not snowing, then it is raining,” which arises by
substitution from the tautology (((A ∨ B) ∧ (¬B)) ⇒ A). A sentence that comes from
a contradictory statement form by means of substitution is said to be logically
false (according to the propositional calculus).


* By a formal theory we mean an artificial language in which the notions of meaningful
expressions, axioms, and rules of inference are precisely described (see page 27).

Now let us prove a few general facts about tautologies.


Proposition 1.2 
If ℬ and (ℬ ⇒ 𝒞) are tautologies, then so is 𝒞.


Proof


Assume that ℬ and (ℬ ⇒ 𝒞) are tautologies. If 𝒞 took the value F for some
assignment of truth values to the statement letters of ℬ and 𝒞, then, since ℬ
is a tautology, ℬ would take the value T and, therefore, (ℬ ⇒ 𝒞) would have
the value F for that assignment. This contradicts the assumption that (ℬ ⇒ 𝒞)
is a tautology. Hence, 𝒞 never takes the value F.


Proposition 1.3 


If 𝒯 is a tautology containing as statement letters A₁, A₂, ..., Aₙ, and ℬ
arises from 𝒯 by substituting statement forms 𝒮₁, 𝒮₂, ..., 𝒮ₙ for A₁, A₂, ..., Aₙ,
respectively, then ℬ is a tautology; that is, substitution in a tautology yields
a tautology.


Example
Let 𝒯 be ((A₁ ∧ A₂) ⇒ A₁), let 𝒮₁ be (B ∨ C) and let 𝒮₂ be (C ∧ D). Then ℬ is
(((B ∨ C) ∧ (C ∧ D)) ⇒ (B ∨ C)).


Proof


Assume that 𝒯 is a tautology. For any assignment of truth values to the
statement letters in ℬ, the forms 𝒮₁, ..., 𝒮ₙ have truth values x₁, ..., xₙ (where each
xᵢ is T or F). If we assign the values x₁, ..., xₙ to A₁, ..., Aₙ, respectively, then
the resulting truth value of 𝒯 is the truth value of ℬ for the given
assignment of truth values. Since 𝒯 is a tautology, this truth value must be T. Thus,
ℬ always takes the value T.
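
The example for Proposition 1.3 can be checked numerically. The sketch below assumes forms are modeled as Python functions of their statement letters; the names t, s1, s2, and substituted are labels chosen for this illustration. Composing the functions mirrors the step in the proof where the values of the substituted forms are fed to the letters A₁ and A₂.

```python
from itertools import product

def is_tautology(n, f):
    """Check that a truth function of n arguments takes only the value True."""
    return all(f(*vals) for vals in product([True, False], repeat=n))

def imp(x, y):
    return (not x) or y

t = lambda a1, a2: imp(a1 and a2, a1)   # ((A1 ∧ A2) ⇒ A1)
s1 = lambda b, c, d: b or c             # (B ∨ C), substituted for A1
s2 = lambda b, c, d: c and d            # (C ∧ D), substituted for A2

# (((B ∨ C) ∧ (C ∧ D)) ⇒ (B ∨ C)), obtained by substitution
substituted = lambda b, c, d: t(s1(b, c, d), s2(b, c, d))

print(is_tautology(2, t), is_tautology(3, substituted))
```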


Proposition 1.4 


If 𝒞₁ arises from ℬ₁ by substitution of 𝒞 for one or more occurrences of ℬ, then
((ℬ ⇔ 𝒞) ⇒ (ℬ₁ ⇔ 𝒞₁)) is a tautology. Hence, if ℬ and 𝒞 are logically equivalent,
then so are ℬ₁ and 𝒞₁.

Example

Let ℬ₁ be (C ∨ D), let ℬ be C, and let 𝒞 be (¬(¬C)). Then 𝒞₁ is ((¬(¬C)) ∨ D). Since
C and (¬(¬C)) are logically equivalent, (C ∨ D) and ((¬(¬C)) ∨ D) are also
logically equivalent.


Proof


Consider any assignment of truth values to the statement letters. If ℬ and
𝒞 have opposite truth values under this assignment, then (ℬ ⇔ 𝒞) takes the
value F, and, hence, ((ℬ ⇔ 𝒞) ⇒ (ℬ₁ ⇔ 𝒞₁)) is T. If ℬ and 𝒞 take the same truth
values, then so do ℬ₁ and 𝒞₁, since 𝒞₁ differs from ℬ₁ only in containing
𝒞 in some places where ℬ₁ contains ℬ. Therefore, in this case, (ℬ ⇔ 𝒞) is T,
(ℬ₁ ⇔ 𝒞₁) is T, and, thus, ((ℬ ⇔ 𝒞) ⇒ (ℬ₁ ⇔ 𝒞₁)) is T.


Parentheses 


It is profitable at this point to agree on some conventions to avoid the use 
of so many parentheses in writing formulas. This will make the reading of 
complicated expressions easier. 

First, we may omit the outer pair of parentheses of a statement form. (In the 
case of statement letters, there is no outer pair of parentheses.) 

Second, we arbitrarily establish the following decreasing order of strength
of the connectives: ¬, ∧, ∨, ⇒, ⇔. Now we shall explain a step-by-step process
for restoring parentheses to an expression obtained by eliminating some or
all parentheses from a statement form. (The basic idea is that, where possible,
we first apply parentheses to negations, then to conjunctions, then to
disjunctions, then to conditionals, and finally to biconditionals.) Find the leftmost
occurrence of the strongest connective that has not yet been processed.


i. If the connective is ¬ and it precedes a statement form ℬ, restore left
and right parentheses to obtain (¬ℬ).


ii. If the connective is a binary connective C and it is preceded by a
statement form ℬ and followed by a statement form 𝒟, restore left and right
parentheses to obtain (ℬ C 𝒟).


iii. If neither (i) nor (ii) holds, ignore the connective temporarily and find
the leftmost occurrence of the strongest of the remaining unprocessed
connectives and repeat (i-iii) for that connective.


Examples
Parentheses are restored to the expression in the first line of each of the
following in the steps shown:

1. A ⇔ (¬B) ∨ C ⇒ A
   A ⇔ ((¬B) ∨ C) ⇒ A
   A ⇔ (((¬B) ∨ C) ⇒ A)
   (A ⇔ (((¬B) ∨ C) ⇒ A))

2. A ⇒ ¬B ⇒ C
   A ⇒ (¬B) ⇒ C
   (A ⇒ (¬B)) ⇒ C
   ((A ⇒ (¬B)) ⇒ C)

3. B ⇒ ¬¬A
   B ⇒ ¬(¬A)
   B ⇒ (¬(¬A))
   (B ⇒ (¬(¬A)))

4. A ∨ ¬(B ⇒ A ∨ B)
   A ∨ ¬(B ⇒ (A ∨ B))
   A ∨ (¬(B ⇒ (A ∨ B)))
   (A ∨ (¬(B ⇒ (A ∨ B))))


Not every form can be represented without the use of parentheses. For
example, parentheses cannot be further eliminated from A ⇒ (B ⇒ C), since
A ⇒ B ⇒ C stands for ((A ⇒ B) ⇒ C). Likewise, the remaining parentheses cannot
be removed from ¬(A ∨ B) or from A ∧ (B ⇒ C).
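
The restoration of parentheses can be imitated by a small parser. The sketch below is not the step-by-step procedure (i)-(iii) itself; it is a standard precedence parser, assuming that the order of strength ¬, ∧, ∨, ⇒, ⇔ together with leftward grouping of repeated connectives gives the same result on well-formed abbreviations. It does agree with the four worked examples. The names tokenize, parse, parse_unary, and restore are hypothetical, and malformed input is not handled.

```python
import re

# Binary connectives, weakest first; ¬ is handled separately as the strongest.
LEVELS = ["⇔", "⇒", "∨", "∧"]

def tokenize(text):
    """Split an expression into statement letters, connectives, and parentheses."""
    return re.findall(r"[A-Z]\d*|[¬∧∨⇒⇔()]", text)

def parse(tokens, level=0):
    """Consume tokens from the front and return a fully parenthesized string."""
    if level == len(LEVELS):
        return parse_unary(tokens)
    left = parse(tokens, level + 1)
    # Repeated occurrences of the same connective are grouped to the left,
    # so A ⇒ B ⇒ C becomes ((A ⇒ B) ⇒ C).
    while tokens and tokens[0] == LEVELS[level]:
        op = tokens.pop(0)
        right = parse(tokens, level + 1)
        left = f"({left} {op} {right})"
    return left

def parse_unary(tokens):
    tok = tokens.pop(0)
    if tok == "¬":
        return f"(¬{parse_unary(tokens)})"
    if tok == "(":
        inner = parse(tokens, 0)
        tokens.pop(0)  # discard the matching ")"
        return inner
    return tok  # a statement letter

def restore(text):
    return parse(tokenize(text))

print(restore("A ⇔ ¬B ∨ C ⇒ A"))
print(restore("A ∨ ¬(B ⇒ A ∨ B)"))
```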


Exercises 


1.15 Eliminate as many parentheses as possible from the following forms.

a. ((B ⇒ (¬A)) ∧ C)
b. (A ∨ (B ∨ C))
c. (((A ∧ (¬B)) ∧ C) ∨ D)
d. ((B ∨ (¬C)) ∨ (A ∧ B))
e. ((A ⇔ B) ⇔ (¬(C ∨ D)))
f. ((¬(¬(¬(B ∨ C)))) ⇔ (B ⇔ C))
g. (¬((¬(¬(B ∨ C))) ⇔ (B ⇔ C)))
h. ((((A ⇒ B) ⇒ (C ⇒ D)) ∧ (¬A)) ∨ C)


1.16 Restore parentheses to the following forms.

a. C ∨ ¬A ∧ B
b. B ⇒ ¬¬¬A ∧ C
c. C ⇒ ¬(A ∧ B ⇒ C) ∧ A ⇔ B
d. C ⇒ A ⇒ A ⇔ ¬A ∨ B


1.17 Determine whether the following expressions are abbreviations of
statement forms and, if so, restore all parentheses.

a. ¬¬A ⇔ A ⇔ B ∨ C
b. ¬(¬A ⇔ A) ⇔ B ∨ C
c. ¬(A ⇒ B) ∨ C ∨ D ⇒ B
d. A ⇔ (¬A ∨ B) ⇒ (A ∧ (B ∨ C))
e. ¬A ∨ B ∨ C ∧ D ⇔ A ∧ ¬A
f. (A ⇒ B ∧ (C ∨ D) ∧ (A ∨ D))


1.18 If we write ¬ℬ instead of (¬ℬ), ⇒ℬ𝒞 instead of (ℬ ⇒ 𝒞), ∧ℬ𝒞 instead of
(ℬ ∧ 𝒞), ∨ℬ𝒞 instead of (ℬ ∨ 𝒞), and ⇔ℬ𝒞 instead of (ℬ ⇔ 𝒞), then there
is no need for parentheses. For example, ((¬A) ∧ (B ⇒ (¬D))), which is
ordinarily abbreviated as ¬A ∧ (B ⇒ ¬D), becomes ∧ ¬A ⇒ B ¬D. This
way of writing forms is called Polish notation.

a. Write ((C ⇒ (¬A)) ∨ B) and (C ∨ ((B ∧ (¬D)) ⇒ C)) in this notation.

b. If we count ⇒, ∧, ∨, and ⇔ each as +1, each statement letter as −1
and ¬ as 0, prove that an expression ℬ in this parenthesis-free
notation is a statement form if and only if (i) the sum of the symbols of
ℬ is −1 and (ii) the sum of the symbols in any proper initial segment
of ℬ is nonnegative. (If an expression ℬ can be written in the form
𝒞𝒟, where 𝒞 ≠ ℬ, then 𝒞 is called a proper initial segment of ℬ.)

c. Write the statement forms of Exercise 1.15 in Polish notation.

d. Determine whether the following expressions are statement forms
in Polish notation. If so, write the statement forms in the standard
way.

i. ¬ ⇒ ABC ∨ AB ¬C
ii. ⇒⇒ AB ⇒⇒ BC ⇒ ¬AC
iii. ∨ ∧ ∨ ¬A¬BC ∧ ∨ AC ∨ ¬C ¬A
iv. ∨ ∧ B ∧ BBB
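
The counting test stated in part (b) of the Polish notation exercise is easy to run mechanically (running it is not a proof of the claim). The sketch below assumes an expression is given as a list of single-character symbols; weight, is_polish_form, and to_infix are names invented for this illustration, and it is applied only to the worked example from the text.

```python
BINARY = {"⇒", "∧", "∨", "⇔"}

def weight(symbol):
    """Binary connectives count +1, ¬ counts 0, statement letters count -1."""
    if symbol in BINARY:
        return 1
    if symbol == "¬":
        return 0
    return -1

def is_polish_form(symbols):
    """Counting test: total is -1 and every proper initial segment sums to >= 0."""
    total = 0
    for i, s in enumerate(symbols):
        total += weight(s)
        if i < len(symbols) - 1 and total < 0:
            return False
    return total == -1

def to_infix(symbols):
    """Rebuild the fully parenthesized form; assumes is_polish_form(symbols) holds."""
    def read(pos):
        s = symbols[pos]
        if s == "¬":
            inner, nxt = read(pos + 1)
            return f"(¬{inner})", nxt
        if s in BINARY:
            left, nxt = read(pos + 1)
            right, nxt = read(nxt)
            return f"({left} {s} {right})", nxt
        return s, pos + 1  # a statement letter
    form, _ = read(0)
    return form

example = list("∧¬A⇒B¬D")
print(is_polish_form(example), to_infix(example))
```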


1.19 Determine whether each of the following is a tautology, is
contradictory, or neither.

a. B ⇔ (B ∨ B)
b. ((A ⇒ B) ∧ B) ⇒ A
c. (¬A) ⇒ (A ∧ B)
d. (A ⇒ B) ⇒ ((B ⇒ C) ⇒ (A ⇒ C))
e. (A ⇔ ¬B) ⇒ A ∨ B
f. A ∧ (¬(A ∨ B))
g. (A ⇒ B) ⇔ ((¬A) ∨ B)
h. (A ⇒ B) ⇔ ¬(A ∧ (¬B))
i. (B ⇔ (B ⇔ A)) ⇒ A
j. A ∧ ¬A ⇒ B


1.20 If A and B are true and C is false, what are the truth values of the
following statement forms?

a. A ∨ C
b. A ∧ C
c. ¬A ∧ ¬C
d. A ⇔ ¬B ∨ C
e. B ∨ ¬C ⇒ A
f. (B ∨ A) ⇒ (B ⇒ ¬C)
g. (B ⇒ ¬A) ⇔ (A ⇔ C)
h. (B ⇒ A) ⇒ ((A ⇒ ¬C) ⇒ (¬C ⇒ B))

1.21 If A ⇒ B is T, what can be deduced about the truth values of the
following?

a. A ∨ C ⇒ B ∨ C
b. A ∧ C ⇒ B ∧ C
c. ¬A ∧ B ⇔ A ∨ B

1.22 What further truth values can be deduced from those shown?

a. ¬A ∨ (A ⇒ B)
   F
b. ¬(A ∧ B) ⇔ ¬A ⇒ ¬B
   T
c. (¬A ∨ B) ⇒ (A ⇒ ¬C)
   F
d. (A ⇔ B) ⇔ (C ⇒ ¬A)
   F T

1.23 If A ⇔ B is F, what can be deduced about the truth values of the
following?

a. A ∧ B
b. A ∨ B
c. A ⇒ B
d. A ∧ C ⇔ B ∧ C

1.24 Repeat Exercise 1.23, but assume that A ⇔ B is T.

1.25 What further truth values can be deduced from those given?

a. (A ∧ B) ⇔ (A ∨ B)
   F F
b. (A ⇒ ¬B) ⇒ (C ⇒ B)
   F
F 
1.26 a. Apply Proposition 1.3 when 𝒯 is A₁ ⇒ A₁ ∨ A₂, 𝒮₁ is B ∧ D, and 𝒮₂
is ¬B.
b. Apply Proposition 1.4 when ℬ₁ is (B ⇒ C) ∧ D, ℬ is B ⇒ C, and 𝒞
is ¬B ∨ C.

1.27 Show that each statement form in column I is logically equivalent to
the form next to it in column II.


     I                   II
a.   A ⇒ (B ⇒ C)         (A ∧ B) ⇒ C
b.   A ∧ (B ∨ C)         (A ∧ B) ∨ (A ∧ C)      (Distributive law)
c.   A ∨ (B ∧ C)         (A ∨ B) ∧ (A ∨ C)      (Distributive law)
d.   (A ∧ B) ∨ ¬B        A ∨ ¬B
e.   (A ∨ B) ∧ ¬B        A ∧ ¬B
f.   A ⇒ B               ¬B ⇒ ¬A                (Law of the contrapositive)
g.   A ⇔ B               B ⇔ A                  (Biconditional commutativity)
h.   (A ⇔ B) ⇔ C         A ⇔ (B ⇔ C)            (Biconditional associativity)
i.   A ⇔ B               (A ∧ B) ∨ (¬A ∧ ¬B)
j.   ¬(A ⇔ B)            A ⇔ ¬B
k.   ¬(A ∨ B)            (¬A) ∧ (¬B)            (De Morgan's law)
l.   ¬(A ∧ B)            (¬A) ∨ (¬B)            (De Morgan's law)
m.   A ∨ (A ∧ B)         A
n.   A ∧ (A ∨ B)         A
o.   A ∧ B               B ∧ A                  (Commutativity of conjunction)
p.   A ∨ B               B ∨ A                  (Commutativity of disjunction)
q.   (A ∧ B) ∧ C         A ∧ (B ∧ C)            (Associativity of conjunction)
r.   (A ∨ B) ∨ C         A ∨ (B ∨ C)            (Associativity of disjunction)
s.   A ⊕ B               B ⊕ A                  (Commutativity of exclusive “or”)
t.   (A ⊕ B) ⊕ C         A ⊕ (B ⊕ C)            (Associativity of exclusive “or”)
u.   A ∧ (B ⊕ C)         (A ∧ B) ⊕ (A ∧ C)      (Distributive law)


1.28 Show the logical equivalence of the following pairs.

a. 𝒯 ∧ ℬ and ℬ, where 𝒯 is a tautology.
b. 𝒯 ∨ ℬ and 𝒯, where 𝒯 is a tautology.
c. ℱ ∧ ℬ and ℱ, where ℱ is contradictory.
d. ℱ ∨ ℬ and ℬ, where ℱ is contradictory.

1.29 a. Show the logical equivalence of ¬(A ⇒ B) and A ∧ ¬B.

b. Show the logical equivalence of ¬(A ⇔ B) and (A ∧ ¬B) ∨ (¬A ∧ B).

c. For each of the following statement forms, find a statement form
that is logically equivalent to its negation and in which negation
signs apply only to statement letters.

i. A ⇒ (B ⇔ ¬C)
ii. ¬A ∨ (B ⇒ C)
iii. A ∧ (B ∨ ¬C)

1.30 (Duality)

a. If ℬ is a statement form involving only ¬, ∧, and ∨, and ℬ′ results
from ℬ by replacing each ∧ by ∨ and each ∨ by ∧, show that ℬ is a
tautology if and only if ¬ℬ′ is a tautology. Then prove that, if ℬ ⇒ 𝒞
is a tautology, then so is 𝒞′ ⇒ ℬ′, and if ℬ ⇔ 𝒞 is a tautology, then so
is ℬ′ ⇔ 𝒞′. (Here 𝒞 is also assumed to involve only ¬, ∧, and ∨.)

b. Among the logical equivalences in Exercise 1.27, derive (c) from (b),
(e) from (d), (l) from (k), (p) from (o), and (r) from (q).

c. If ℬ is a statement form involving only ¬, ∧, and ∨, and ℬ* results
from ℬ by interchanging ∧ and ∨ and replacing every statement
letter by its negation, show that ℬ* is logically equivalent to ¬ℬ. Find a
statement form that is logically equivalent to the negation of
(A ∨ B ∨ C) ∧ (¬A ∨ ¬B ∨ D), in which ¬ applies only to statement letters.

1.31 a. Prove that a statement form that contains ⇔ as its only connective
is a tautology if and only if each statement letter occurs an even
number of times.

b. Prove that a statement form that contains ¬ and ⇔ as its only
connectives is a tautology if and only if ¬ and each statement letter
occur an even number of times.


1.32 (Shannon, 1938) An electric circuit containing only on-off switches
(when a switch is on, it passes current; otherwise it does not) can be
represented by a diagram in which, next to each switch, we put a letter
representing a necessary and sufficient condition for the switch to be on
(see Figure 1.1). The condition that a current flows through this network
can be given by the statement form (A ∧ B) ∨ (C ∧ ¬A). A statement form
representing the circuit shown in Figure 1.2 is (A ∧ B) ∨ ((C ∨ A) ∧ ¬B),
which is logically equivalent to each of the following forms by virtue
of the indicated logical equivalence of Exercise 1.27.


FIGURE 1.1 [circuit diagram for (A ∧ B) ∨ (C ∧ ¬A)]

FIGURE 1.2 [circuit diagram for (A ∧ B) ∨ ((C ∨ A) ∧ ¬B)]


((A ∧ B) ∨ (C ∨ A)) ∧ ((A ∧ B) ∨ ¬B)      (c)
((A ∧ B) ∨ (C ∨ A)) ∧ (A ∨ ¬B)            (d)
((A ∧ B) ∨ (A ∨ C)) ∧ (A ∨ ¬B)            (p)
(((A ∧ B) ∨ A) ∨ C) ∧ (A ∨ ¬B)            (r)
(A ∨ C) ∧ (A ∨ ¬B)                        (p), (m)
A ∨ (C ∧ ¬B)                              (c)


Hence, the given circuit is equivalent to the simpler circuit shown 
in Figure 1.3. (Two circuits are said to be equivalent if current flows 
through one if and only if it flows through the other, and one circuit is 
simpler if it contains fewer switches.) 
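
The equivalence of the circuit of Figure 1.2 and its simplification can be confirmed by checking all eight switch settings. This is a minimal sketch, assuming each switch condition is modeled as a Python boolean; the names original and simplified are illustrative.

```python
from itertools import product

# (A ∧ B) ∨ ((C ∨ A) ∧ ¬B): the form representing the circuit of Figure 1.2
original = lambda a, b, c: (a and b) or ((c or a) and (not b))
# A ∨ (C ∧ ¬B): the simplified form
simplified = lambda a, b, c: a or (c and (not b))

# Current flows through one circuit exactly when it flows through the other.
same = all(original(*v) == simplified(*v)
           for v in product([True, False], repeat=3))
print(same)
```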


a. Find simpler equivalent circuits for those shown in Figures 1.4 
through 1.6. 


b. Assume that each of the three members of a committee votes yes on 
a proposal by pressing a button. Devise as simple a circuit as you 
can that will allow current to pass when and only when at least 
two of the members vote in the affirmative. 


c. We wish a light to be controlled by two different wall switches ina 
room in such a way that flicking either one of these switches will 
turn the light on if it is off and turn it off if it is on. Construct a 
simple circuit to do the required job. 


FIGURE 1.3 [circuit diagram for A ∨ (C ∧ ¬B)]

FIGURE 1.4 [circuit diagram]

FIGURES 1.5 and 1.6 [circuit diagrams]


1.33 Determine whether the following arguments are logically correct by
representing each sentence as a statement form and checking whether
the conclusion is logically implied by the conjunction of the
assumptions. (To do this, assign T to each assumption and F to the conclusion,
and determine whether a contradiction results.)


a. If Jones is a communist, Jones is an atheist. Jones is an atheist.
Therefore, Jones is a communist.


b. If the temperature and air pressure remained constant, there was
no rain. The temperature did remain constant. Therefore, if there
was rain, then the air pressure did not remain constant.


c. If Gorton wins the election, then taxes will increase if the deficit
will remain high. If Gorton wins the election, the deficit will remain
high. Therefore, if Gorton wins the election, taxes will increase.


d. If the number x ends in 0, it is divisible by 5. x does not end in 0.
Hence, x is not divisible by 5.


e. If the number x ends in 0, it is divisible by 5. x is not divisible by 5.
Hence, x does not end in 0.


f. If a = 0 or b = 0, then ab = 0. But ab ≠ 0. Hence, a ≠ 0 and b ≠ 0.


g. A sufficient condition for f to be integrable is that g be bounded.
A necessary condition for h to be continuous is that f is integrable.
Hence, if g is bounded or h is continuous, then f is integrable.


h. Smith cannot both be a running star and smoke cigarettes. Smith is
not a running star. Therefore, Smith smokes cigarettes.


i. If Jones drove the car, Smith is innocent. If Brown fired the gun,
then Smith is not innocent. Hence, if Brown fired the gun, then
Jones did not drive the car.


1.34 Which of the following sets of statement forms are satisfiable, in the
sense that there is an assignment of truth values to the statement
letters that makes all the forms in the set true?


a. A ⇒ B
   B ⇒ C
   C ∨ D ⇔ ¬B

b. ¬(¬B ∨ A)
   A ∨ ¬C
   B ⇒ ¬C

c. D ⇒ B
   A ∨ ¬B
   ¬(D ∧ A)
   D


1.35 Check each of the following sets of statements for consistency by rep- 
resenting the sentences as statement forms and then testing their con- 
junction to see whether it is contradictory. 


a. Either the witness was intimidated or, if Doherty committed 
suicide, a note was found. If the witness was intimidated, then 
Doherty did not commit suicide. If a note was found, then Doherty 
committed suicide. 


b. The contract is satisfied if and only if the building is completed 
by 30 November. The building is completed by 30 November 
if and only if the electrical subcontractor completes his work by 
10 November. The bank loses money if and only if the contract is 
not satisfied. Yet the electrical subcontractor completes his work by 
10 November if and only if the bank loses money. 


1.3 Adequate Sets of Connectives 


Every statement form containing n statement letters generates a correspond- 
ing truth function of n arguments. The arguments and values of the func- 
tion are T or F. Logically equivalent forms generate the same truth function. 
A natural question is whether all truth functions are so generated. 


Proposition 1.5 


Every truth function is generated by a statement form involving the
connectives ¬, ∧, and ∨.
Proof


(Refer to Examples 1 and 2 below for clarification.) Let f(x_1, ..., x_n) be a
truth function. Clearly f can be represented by a truth table of 2^n rows,
where each row represents some assignment of truth values to the variables
x_1, ..., x_n, followed by the corresponding value of f(x_1, ..., x_n). If
1 ≤ i ≤ 2^n, let C_i be the conjunction U_1^i ∧ U_2^i ∧ ... ∧ U_n^i, where
U_j^i is A_j if, in the ith row of the truth table, x_j takes the value T, and
U_j^i is ¬A_j if x_j takes the value F in that row. Let D be the disjunction of
all those C_i such that f has the value T for the ith row of the truth table.
(If there are no such rows, then f always takes the value F, and we let D be
A_1 ∧ ¬A_1, which satisfies the theorem.) Notice that D involves only ¬, ∧,
and ∨. To see that D has f as its corresponding truth function, let there be
given an assignment of truth values to the statement letters A_1, ..., A_n,
and assume that the corresponding assignment to the variables x_1, ..., x_n is
row k of the truth table for f. Then C_k has the value T for this assignment,
whereas every other C_i has the value F. If f has the value T for row k, then
C_k is a disjunct of D. Hence, D would also have the value T for this
assignment. If f has the value F for row k, then C_k is not a disjunct of D
and all the disjuncts take the value F for this assignment. Therefore, D would
also have the value F. Thus, D generates the truth function f.


Examples

1.  x_1  x_2  f(x_1, x_2)
    T    T    F
    F    T    T
    T    F    T
    F    F    T

    D is (¬A_1 ∧ A_2) ∨ (A_1 ∧ ¬A_2) ∨ (¬A_1 ∧ ¬A_2).

2.  x_1  x_2  x_3  g(x_1, x_2, x_3)
    T    T    T    T
    F    T    T    F
    T    F    T    T
    F    F    T    T
    T    T    F    F
    F    T    F    F
    T    F    F    F
    F    F    F    T

    D is (A_1 ∧ A_2 ∧ A_3) ∨ (A_1 ∧ ¬A_2 ∧ A_3) ∨ (¬A_1 ∧ ¬A_2 ∧ A_3)
    ∨ (¬A_1 ∧ ¬A_2 ∧ ¬A_3).
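
The construction in the proof of Proposition 1.5 can be sketched in a few lines of Python. This is only an illustration: the function names `dnf_rows`, `dnf_string`, and `dnf_value` are hypothetical, truth values are modelled by Python booleans, and rows are listed in the order produced by `itertools.product`, which differs from the row order of the tables above but yields a logically equivalent D.

```python
from itertools import product

def dnf_rows(f, n):
    """Rows of the truth table (tuples of booleans) on which f takes the value T."""
    return [row for row in product([True, False], repeat=n) if f(*row)]

def dnf_string(f, n):
    """Write out the form D of Proposition 1.5 for an n-argument truth function f."""
    rows = dnf_rows(f, n)
    if not rows:
        # f is always F, so D is taken to be A1 ∧ ¬A1
        return "A1 ∧ ¬A1"

    def conjunction(row):
        # C_i: A_j if x_j is T in this row, ¬A_j if x_j is F
        literals = [("" if value else "¬") + f"A{j}"
                    for j, value in enumerate(row, start=1)]
        return "(" + " ∧ ".join(literals) + ")"

    return " ∨ ".join(conjunction(row) for row in rows)

def dnf_value(f, n, assignment):
    """Value of D under an assignment: C_i is T exactly on row i, and D is T
    exactly when one of its disjuncts is T."""
    return any(all(a == v for a, v in zip(assignment, row))
               for row in dnf_rows(f, n))

# Example 1 above: f is F only when both arguments are T.
f = lambda x1, x2: not (x1 and x2)
print(dnf_string(f, 2))
```

Comparing `dnf_value(f, n, row)` with `f(*row)` on every row is a direct check that D generates f.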
Exercise


1.36 Find statement forms in the connectives ¬, ∧, and ∨ that have the
following truth functions.


x_1  x_2  x_3  f(x_1, x_2, x_3)  g(x_1, x_2, x_3)  h(x_1, x_2, x_3)
T    T    T    T                 T                 F
F    T    T    T                 T                 T
T    F    T    T                 T                 F
F    F    T    F                 F                 F
T    T    F    F                 T                 T
F    T    F    F                 F                 T
T    F    F    F                 T                 F
F    F    F    T                 F                 T


Corollary 1.6 


Every truth function can be generated by a statement form containing as
connectives only ∧ and ¬, or only ∨ and ¬, or only ⇒ and ¬.


Proof 


Notice that ℬ ∨ 𝒞 is logically equivalent to ¬(¬ℬ ∧ ¬𝒞). Hence, by the
second part of Proposition 1.4, any statement form in ∧, ∨, and ¬ is logically
equivalent to a statement form in only ∧ and ¬ [obtained by replacing all
expressions ℬ ∨ 𝒞 by ¬(¬ℬ ∧ ¬𝒞)]. The other parts of the corollary are similar
consequences of the following tautologies:


ℬ ∧ 𝒞 ⇔ ¬(¬ℬ ∨ ¬𝒞)

ℬ ∨ 𝒞 ⇔ (¬ℬ ⇒ 𝒞)

ℬ ∧ 𝒞 ⇔ ¬(ℬ ⇒ ¬𝒞)
We have just seen that there are certain pairs of connectives (for example,
∧ and ¬) in terms of which all truth functions are definable. It turns
out that there is a single connective, ↓ (joint denial), that will do the same
job. Its truth table is


A  B  A ↓ B
T  T  F
F  T  F
T  F  F
F  F  T
A ↓ B is true when and only when neither A nor B is true. Clearly,
¬A ⇔ (A ↓ A) and (A ∧ B) ⇔ ((A ↓ A) ↓ (B ↓ B)) are tautologies. Hence, the
adequacy of ↓ for the construction of all truth functions follows from
Corollary 1.6.

Another connective, | (alternative denial), is also adequate for this
purpose. Its truth table is


A  B  A | B
T  T  F
F  T  T
T  F  T
F  F  T


A | B is true when and only when not both A and B are true. The adequacy
of | follows from the tautologies ¬A ⇔ (A | A) and (A ∨ B) ⇔ ((A | A) | (B | B)).
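
The four tautologies just cited can be confirmed by brute force over all truth assignments. The following is a small sketch, with Python booleans standing for T and F; the helper names `nor`, `nand`, and `is_tautology` are illustrative only.

```python
from itertools import product

def nor(a, b):
    """Joint denial A ↓ B: true exactly when neither A nor B is true."""
    return not (a or b)

def nand(a, b):
    """Alternative denial A | B: true exactly when not both A and B are true."""
    return not (a and b)

def is_tautology(form, n):
    """True if the n-argument boolean function `form` is T on every row."""
    return all(form(*row) for row in product([True, False], repeat=n))

checks = {
    "¬A ⇔ (A ↓ A)": is_tautology(lambda a: (not a) == nor(a, a), 1),
    "(A ∧ B) ⇔ ((A ↓ A) ↓ (B ↓ B))":
        is_tautology(lambda a, b: (a and b) == nor(nor(a, a), nor(b, b)), 2),
    "¬A ⇔ (A | A)": is_tautology(lambda a: (not a) == nand(a, a), 1),
    "(A ∨ B) ⇔ ((A | A) | (B | B))":
        is_tautology(lambda a, b: (a or b) == nand(nand(a, a), nand(b, b)), 2),
}
for name, ok in checks.items():
    print(name, ok)
```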


Proposition 1.7 


The only binary connectives that alone are adequate for the construction
of all truth functions are ↓ and |.


Proof 


Assume that h(A, B) is an adequate connective. Now, if h(T, T) were T, then
any statement form built up using h alone would take the value T when all
its statement letters take the value T. Hence, ¬A would not be definable in
terms of h. So, h(T, T) = F. Likewise, h(F, F) = T. Thus, we have the partial
truth table:


A  B  h(A, B)
T  T  F
F  T
T  F
F  F  T


If the second and third entries in the last column are F, F or T, T, then h is ↓
or |. If they are F, T, then h(A, B) ⇔ ¬B is a tautology; and if they are T, F, then
h(A, B) ⇔ ¬A is a tautology. In both cases, h would be definable in terms of ¬.
But ¬ is not adequate by itself because the only truth functions of one
variable definable from it are the identity function and negation itself, whereas
the truth function that is always T would not be definable.
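
Proposition 1.7 can also be checked exhaustively, since there are only 16 binary connectives. The sketch below rests on one assumption stated here rather than in the text: a connective is treated as adequate exactly when the forms built from it on the two letters A and B yield all 16 two-argument truth functions (this suffices, because ¬ and ∧ are then among them and Corollary 1.6 applies). The names `ROWS` and `generated_functions` are hypothetical.

```python
from itertools import product

# Rows of a two-letter truth table, in the order (A, B).
ROWS = [(True, True), (True, False), (False, True), (False, False)]

def generated_functions(table):
    """All two-argument truth functions (as 4-tuples over ROWS) definable from
    the single connective h whose truth table over ROWS is `table`."""
    h = dict(zip(ROWS, table))
    # Start from the statement letters A and B themselves.
    known = {tuple(r[0] for r in ROWS), tuple(r[1] for r in ROWS)}
    while True:
        # Apply h to every pair of already definable functions, row by row.
        new = {tuple(h[(x, y)] for x, y in zip(f, g))
               for f in known for g in known}
        if new <= known:
            return known
        known |= new

adequate = [table for table in product([True, False], repeat=4)
            if len(generated_functions(table)) == 16]
print(adequate)
```

Under that assumption, the two tables that survive are the ones for | and ↓, in agreement with the proposition.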


Exercises


1.37 Prove that each of the pairs ⇒, ∨ and ¬, ⇔ is not alone adequate to
express all truth functions.

1.38 a. Prove that A ∨ B can be expressed in terms of ⇒ alone.

b. Prove that A ∧ B cannot be expressed in terms of ⇒ alone.

c. Prove that A ⇔ B cannot be expressed in terms of ⇒ alone.

1.39 Show that any two of the connectives {∧, ⇒, ⇔} serve to define the
remaining one.

1.40 With one variable A, there are four truth functions:


A  ¬A  A ∨ ¬A  A ∧ ¬A
T  F   T       F
F  T   T       F


a. With two variables A and B, how many truth functions are there?
b. How many truth functions of n variables are there?

1.41 Show that the truth function h determined by (A ∨ B) ⇒ ¬C generates
all truth functions.

1.42 By a literal we mean a statement letter or a negation of a statement
letter. A statement form is said to be in disjunctive normal form (dnf)
if it is a disjunction consisting of one or more disjuncts, each of
which is a conjunction of one or more literals. Examples are
(A ∧ B) ∨ (¬A ∧ C), (A ∧ B ∧ ¬A) ∨ (C ∧ ¬B) ∨ (A ∧ ¬C), A, A ∧ B, and
A ∨ (B ∨ C). A form is in conjunctive normal form (cnf) if it is a
conjunction of one or more conjuncts, each of which is a disjunction of
one or more literals. Examples are (B ∨ C) ∧ (A ∨ B), (B ∨ ¬C) ∧ (A ∨ D),
A ∧ (B ∨ A) ∧ (¬B ∨ A), A ∨ ¬B, A ∧ B, A. Note that our terminology
considers a literal to be a (degenerate) conjunction and a (degenerate)
disjunction.

a. The proof of Proposition 1.5 shows that every statement form ℬ is
logically equivalent to one in disjunctive normal form. By applying
this result to ¬ℬ, prove that ℬ is also logically equivalent to a form
in conjunctive normal form.
b. Find logically equivalent dnfs and cnfs for ¬(A ⇒ B) ∨ (¬A ∧ C) and
A ⇔ ((B ∧ ¬A) ∨ C). [Hint: Instead of relying on Proposition 1.5, it is
usually easier to use Exercise 1.27(b) and (c).]

c. A dnf (cnf) is called full if no disjunct (conjunct) contains two
occurrences of literals with the same letter and if a letter that occurs in
one disjunct (conjunct) also occurs in all the others. For example,
(A ∧ ¬A ∧ B) ∨ (A ∧ B), (B ∧ B ∧ C) ∨ (B ∧ C) and (B ∧ C) ∨ B are
not full, whereas (A ∧ B ∧ ¬C) ∨ (A ∧ B ∧ C) ∨ (A ∧ ¬B ∧ ¬C) and
(A ∧ ¬B) ∨ (B ∧ A) are full dnfs.


i. Find full dnfs and cnfs logically equivalent to (A ∧ B) ∨ ¬A
and ¬(A ⇒ B) ∨ (¬A ∧ C).


ii. Prove that every noncontradictory (nontautologous) statement
form ℬ is logically equivalent to a full dnf (cnf) 𝒞, and,
if 𝒞 contains exactly n letters, then ℬ is a tautology (is
contradictory) if and only if 𝒞 has 2^n disjuncts (conjuncts).


d. For each of the following, find a logically equivalent dnf (cnf), and
then find a logically equivalent full dnf (cnf):


i. (A ∨ B) ∧ (¬B ∨ C)
ii. ¬A ∨ (B ⇒ ¬C)
iii. (A ∧ ¬B) ∨ (A ∧ C)
iv. (A ∨ B) ⇔ ¬C

e. Construct statement forms in ¬ and ∧ (respectively, in ¬ and ∨ or in
¬ and ⇒) logically equivalent to the statement forms in (d).


1.43 A statement form is said to be satisfiable if it is true for some assignment
of truth values to its statement letters. The problem of determining the
satisfiability of an arbitrary cnf plays an important role in the theory of
computational complexity; it is an example of a so-called NP-complete
problem (see Garey and Johnson, 1978).


a. Show that ℬ is satisfiable if and only if ¬ℬ is not a tautology.
b. Determine whether the following are satisfiable:
i. (A ∨ B) ∧ (¬A ∨ B ∨ C) ∧ (¬A ∨ ¬B ∨ ¬C)
ii. ((A ⇒ B) ∨ C) ⇔ (¬B ∧ (A ∨ C))
c. Given a disjunction 𝒟 of four or more literals: L_1 ∨ L_2 ∨ ... ∨ L_n, let
C_1, ..., C_{n−2} be statement letters that do not occur in 𝒟, and construct
the cnf ℰ:


(L_1 ∨ L_2 ∨ C_1) ∧ (¬C_1 ∨ L_3 ∨ C_2) ∧ (¬C_2 ∨ L_4 ∨ C_3) ∧ ...
∧ (¬C_{n−3} ∨ L_{n−1} ∨ C_{n−2}) ∧ (¬C_{n−2} ∨ L_n ∨ ¬C_1)


Show that any truth assignment satisfying 𝒟 can be extended
to a truth assignment satisfying ℰ and, conversely, any truth
assignment satisfying ℰ is an extension of a truth assignment
satisfying 𝒟. (This permits the reduction of the problem of satisfying
cnfs to the corresponding problem for cnfs with each conjunct
containing at most three literals.)


d. For a disjunction 𝒟 of three literals L_1 ∨ L_2 ∨ L_3, show that a
form that has the properties of ℰ in (c) cannot be constructed,
with ℰ a cnf in which each conjunct contains at most two literals
(R. Cowen).


1.44 (Resolution) Let ℬ be a cnf and let C be a statement letter. If C is a
disjunct of a disjunction 𝒟_1 in ℬ and ¬C is a disjunct of another
disjunction 𝒟_2 in ℬ, then a nonempty disjunction obtained by eliminating C
from 𝒟_1 and ¬C from 𝒟_2 and forming the disjunction of the remaining
literals (dropping repetitions) is said to be obtained from ℬ by
resolution on C. For example, if ℬ is


(A ∨ ¬C ∨ ¬B) ∧ (¬A ∨ D ∨ ¬B) ∧ (C ∨ D ∨ A)


the first and third conjuncts yield A ∨ ¬B ∨ D by resolution on C. In
addition, the first and second conjuncts yield ¬C ∨ ¬B ∨ D by
resolution on A, and the second and third conjuncts yield D ∨ ¬B ∨ C by
resolution on A. If we conjoin to ℬ any new disjunctions obtained by
resolution on all variables, and if we apply the same procedure to
the new cnf and keep on iterating this operation, the process must
eventually stop, and the final result is denoted Res(ℬ). In the example,
Res(ℬ) is


(A ∨ ¬C ∨ ¬B) ∧ (¬A ∨ D ∨ ¬B) ∧ (C ∨ D ∨ A) ∧ (¬C ∨ ¬B ∨ D)
∧ (D ∨ ¬B ∨ C) ∧ (A ∨ ¬B ∨ D) ∧ (D ∨ ¬B)


(Notice that we have not been careful about specifying the order in
which conjuncts or disjuncts are written, since any two arrangements
will be logically equivalent.)


a. Find Res(ℬ) when ℬ is each of the following:
i. (A ∨ ¬B) ∧ B
ii. (A ∨ B ∨ C) ∧ (A ∨ ¬B ∨ C)
iii. (A ∨ C) ∧ (¬A ∨ B) ∧ (A ∨ ¬C) ∧ (¬A ∨ ¬B)
b. Show that ℬ logically implies Res(ℬ).
c. If ℬ is a cnf, let ℬ_C be the cnf obtained from ℬ by deleting those
conjuncts that contain C or ¬C. Let r_C(ℬ) be the cnf that is the
conjunction of ℬ_C and all those disjunctions obtained from ℬ by resolution
on C. For example, if ℬ is the cnf in the example above, then r_C(ℬ)
is (¬A ∨ D ∨ ¬B) ∧ (A ∨ ¬B ∨ D). Prove that, if r_C(ℬ) is satisfiable,
then so is ℬ (R. Cowen).


d. A cnf ℬ is said to be a blatant contradiction if it contains some letter
C and its negation ¬C as conjuncts. An example of a blatant
contradiction is (A ∨ B) ∧ B ∧ (C ∨ D) ∧ ¬B. Prove that if ℬ is unsatisfiable,
then Res(ℬ) is a blatant contradiction. [Hint: Use induction on the
number n of letters that occur in ℬ. In the induction step, use (c).]


e. Prove that ℬ is unsatisfiable if and only if Res(ℬ) is a blatant
contradiction.
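
The iteration described in Exercise 1.44 can be mechanized in a few lines. The sketch below is one possible encoding, not the book's: a cnf is a collection of frozensets of literal strings such as "A" and "¬A", and the names `negate`, `resolvents`, and `res` are hypothetical. It reproduces the worked example from the statement of the exercise.

```python
def negate(literal):
    """Complement of a literal written as 'A' or '¬A'."""
    return literal[1:] if literal.startswith("¬") else "¬" + literal

def resolvents(d1, d2):
    """Nonempty disjunctions obtained by resolution on a letter C that occurs
    in d1 while ¬C occurs in d2."""
    found = set()
    for literal in d1:
        if not literal.startswith("¬") and negate(literal) in d2:
            rest = (d1 - {literal}) | (d2 - {negate(literal)})
            if rest:  # only nonempty disjunctions count
                found.add(frozenset(rest))
    return found

def res(cnf):
    """Keep conjoining resolvents until no new disjunction appears."""
    clauses = set(cnf)
    while True:
        new = {r for a in clauses for b in clauses for r in resolvents(a, b)}
        if new <= clauses:
            return clauses
        clauses |= new

example = [
    frozenset({"A", "¬C", "¬B"}),
    frozenset({"¬A", "D", "¬B"}),
    frozenset({"C", "D", "A"}),
]
for clause in sorted(res(example), key=lambda c: (len(c), sorted(c))):
    print(" ∨ ".join(sorted(clause)))
```

Termination is guaranteed because only finitely many disjunctions can be formed from the literals that occur in the input.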


1.45 Let ℬ and 𝒟 be statement forms such that ℬ ⇒ 𝒟 is a tautology.


a. If ℬ and 𝒟 have no statement letters in common, show that either ℬ
is contradictory or 𝒟 is a tautology.


b. (Craig's interpolation theorem) If ℬ and 𝒟 have the statement letters
B_1, ..., B_n in common, prove that there is a statement form 𝒞 having
B_1, ..., B_n as its only statement letters such that ℬ ⇒ 𝒞 and 𝒞 ⇒ 𝒟 are
tautologies.


c. Solve the special case of (b) in which ℬ is (B_1 ⇒ A) ∧ (A ⇒ B_2) and 𝒟
is (B_1 ∧ C) ⇒ (B_2 ∧ C).

1.46 a. A certain country is inhabited only by truth-tellers (people who 
always tell the truth) and liars (people who always lie). Moreover, 
the inhabitants will respond only to yes or no questions. A tourist 
comes to a fork in a road where one branch leads to the capital 
and the other does not. There is no sign indicating which branch 
to take, but there is a native standing at the fork. What yes or 
no question should the tourist ask in order to determine which 
branch to take? [Hint: Let A stand for “You are a truth-teller” 
and let B stand for “The left-hand branch leads to the capital.” 
Construct, by means of a suitable truth table, a statement form 
involving A and B such that the native’s answer to the question as 
to whether this statement form is true will be yes when and only 
when B is true.] 


b. In a certain country, there are three kinds of people: workers (who
always tell the truth), businessmen (who always lie), and students
(who sometimes tell the truth and sometimes lie). At a fork in the
road, one branch leads to the capital. A worker, a businessman, and
a student are standing at the side of the road but are not identifiable
in any obvious way. By asking two yes or no questions, find out
which fork leads to the capital. (Each question may be addressed to
any of the three.)


More puzzles of this kind may be found in Smullyan (1978, Chapter 3; 1985, 
Chapters 2, 4 through 8). 
1.4 An Axiom System for the Propositional Calculus


Truth tables enable us to answer many of the significant questions concerning 
the truth-functional connectives, such as whether a given statement form is a 
tautology, is contradictory, or neither, and whether it logically implies or is logi- 
cally equivalent to some other given statement form. The more complex parts 
of logic we shall treat later cannot be handled by truth tables or by any other 
similar effective procedure. Consequently, another approach, by means of for- 
mal axiomatic theories, will have to be tried. Although, as we have seen, the 
propositional calculus surrenders completely to the truth table method, it will 
be instructive to illustrate the axiomatic method in this simple branch of logic. 
A formal theory 𝒮 is defined when the following conditions are satisfied:


1. A countable set of symbols is given as the symbols of 𝒮.* A finite
sequence of symbols of 𝒮 is called an expression of 𝒮.


2. There is a subset of the set of expressions of 𝒮 called the set of
well-formed formulas (wfs) of 𝒮. There is usually an effective procedure to
determine whether a given expression is a wf.


3. There is a set of wfs called the set of axioms of 𝒮. Most often, one can
effectively decide whether a given wf is an axiom; in such a case, 𝒮 is
called an axiomatic theory.


4. There is a finite set R_1, ..., R_n of relations among wfs, called rules of
inference. For each R_i, there is a unique positive integer j such that, for every
set of j wfs and each wf ℬ, one can effectively decide whether the given
j wfs are in the relation R_i to ℬ, and, if so, ℬ is said to follow from or to be
a direct consequence of the given wfs by virtue of R_i.†


A proof in 𝒮 is a sequence ℬ_1, ..., ℬ_k of wfs such that, for each i, either ℬ_i is an
axiom of 𝒮 or ℬ_i is a direct consequence of some of the preceding wfs in the
sequence by virtue of one of the rules of inference of 𝒮.

A theorem of 𝒮 is a wf ℬ of 𝒮 such that ℬ is the last wf of some proof in 𝒮.
Such a proof is called a proof of ℬ in 𝒮.

Even if 𝒮 is axiomatic (that is, if there is an effective procedure for
checking any given wf to see whether it is an axiom), the notion of "theorem" is
not necessarily effective since, in general, there is no effective procedure for
determining, given any wf ℬ, whether there is a proof of ℬ. A theory for
which there is such an effective procedure is said to be decidable; otherwise,
the theory is said to be undecidable.


* These "symbols" may be thought of as arbitrary objects rather than just linguistic objects.
This will become absolutely necessary when we deal with theories with uncountably many
symbols in Section 2.12.

† An example of a rule of inference will be the rule modus ponens (MP): 𝒞 follows from ℬ and
ℬ ⇒ 𝒞. According to our precise definition, this rule is the relation consisting of all ordered
triples ⟨ℬ, ℬ ⇒ 𝒞, 𝒞⟩, where ℬ and 𝒞 are arbitrary wfs of the formal system.
From an intuitive standpoint, a decidable theory is one for which a machine
can be devised to test wfs for theoremhood, whereas, for an undecidable
theory, ingenuity is required to determine whether wfs are theorems.

A wf 𝒞 is said to be a consequence in 𝒮 of a set Γ of wfs if and only if there
is a sequence ℬ_1, ..., ℬ_k of wfs such that 𝒞 is ℬ_k and, for each i, either ℬ_i is an
axiom or ℬ_i is in Γ, or ℬ_i is a direct consequence by some rule of inference of
some of the preceding wfs in the sequence. Such a sequence is called a proof
(or deduction) of 𝒞 from Γ. The members of Γ are called the hypotheses or
premisses of the proof. We use Γ ⊢ 𝒞 as an abbreviation for "𝒞 is a consequence of
Γ". In order to avoid confusion when dealing with more than one theory, we
write Γ ⊢_𝒮 𝒞, adding the subscript 𝒮 to indicate the theory in question.

If Γ is a finite set {ℋ_1, ..., ℋ_m}, we write ℋ_1, ..., ℋ_m ⊢ 𝒞 instead of
{ℋ_1, ..., ℋ_m} ⊢ 𝒞. If Γ is the empty set ∅, then ∅ ⊢ 𝒞 if and only if 𝒞 is a
theorem. It is customary to omit the sign "∅" and simply write ⊢ 𝒞. Thus, ⊢ 𝒞 is
another way of asserting that 𝒞 is a theorem.

The following are simple properties of the notion of consequence:


1. If Γ ⊆ Δ and Γ ⊢ 𝒞, then Δ ⊢ 𝒞.
2. Γ ⊢ 𝒞 if and only if there is a finite subset Δ of Γ such that Δ ⊢ 𝒞.
3. If Δ ⊢ 𝒞, and for each ℬ in Δ, Γ ⊢ ℬ, then Γ ⊢ 𝒞.


Assertion 1 represents the fact that if 𝒞 is provable from a set Γ of premisses,
then, if we add still more premisses, 𝒞 is still provable. Half of 2 follows from
1. The other half is obvious when we notice that any proof of 𝒞 from Γ uses
only a finite number of premisses from Γ. Assertion 3 is also quite simple:
if 𝒞 is provable from premisses in Δ, and each premiss in Δ is provable from
premisses in Γ, then 𝒞 is provable from premisses in Γ.

We now introduce a formal axiomatic theory L for the propositional calculus. 


1. The symbols of L are ¬, ⇒, (, ), and the letters A_i with positive integers
i as subscripts: A_1, A_2, A_3, .... The symbols ¬ and ⇒ are called primitive
connectives, and the letters A_i are called statement letters.


2. a. All statement letters are wfs.


b. If ℬ and 𝒞 are wfs, then so are (¬ℬ) and (ℬ ⇒ 𝒞).* Thus, a wf of L
is just a statement form built up from the statement letters A_i by
means of the connectives ¬ and ⇒.


3. If ℬ, 𝒞, and 𝒟 are wfs of L, then the following are axioms of L:
(A1) (ℬ ⇒ (𝒞 ⇒ ℬ))
(A2) ((ℬ ⇒ (𝒞 ⇒ 𝒟)) ⇒ ((ℬ ⇒ 𝒞) ⇒ (ℬ ⇒ 𝒟)))
(A3) (((¬𝒞) ⇒ (¬ℬ)) ⇒ (((¬𝒞) ⇒ ℬ) ⇒ 𝒞))


* To be precise, we should add the so called extremal clause: (c) an expression is a wf if and 
only if it can be shown to be a wf on the basis of clauses (a) and (b). This can be made rigorous 
using as a model the definition of statement form in the footnote on page 4. 
4. The only rule of inference of L is modus ponens: 𝒞 is a direct consequence
of ℬ and (ℬ ⇒ 𝒞). We shall abbreviate applications of this rule by MP.*


We shall use our conventions for eliminating parentheses. 

Notice that the infinite set of axioms of L is given by means of three axiom 
schemas (A1)-(A3), with each schema standing for an infinite number of axi- 
oms. One can easily check for any given wf whether or not it is an axiom; 
therefore, L is axiomatic. In setting up the system L, it is our intention to 
obtain as theorems precisely the class of all tautologies. 

We introduce other connectives by definition: 


(D1) (ℬ ∧ 𝒞) for ¬(ℬ ⇒ ¬𝒞)
(D2) (ℬ ∨ 𝒞) for (¬ℬ) ⇒ 𝒞
(D3) (ℬ ⇔ 𝒞) for (ℬ ⇒ 𝒞) ∧ (𝒞 ⇒ ℬ)

The meaning of (D1), for example, is that, for any wfs ℬ and 𝒞, "(ℬ ∧ 𝒞)" is an
abbreviation for "¬(ℬ ⇒ ¬𝒞)".


Lemma 1.8: ⊢_L ℬ ⇒ ℬ for all wfs ℬ.


Proof


We shall construct a proof in L of ℬ ⇒ ℬ.


1. (ℬ ⇒ ((ℬ ⇒ ℬ) ⇒ ℬ)) ⇒ ((ℬ ⇒ (ℬ ⇒ ℬ)) ⇒ (ℬ ⇒ ℬ))    Instance of axiom schema (A2)


* A common English synonym for modus ponens is the detachment rule. 

* The word “proof” is used in two distinct senses. First, it has a precise meaning defined above as 
a certain kind of finite sequence of wfs of L. However, in another sense, it also designates certain 
sequences of the English language (supplemented by various technical terms) that are supposed to 
serve as an argument justifying some assertion about the language L (or other formal theories). In 
general, the language we are studying (in this case, L) is called the object language, while the language 
in which we formulate and prove statements about the object language is called the metalanguage. 
The metalanguage might also be formalized and made the subject of study, which we would carry 
out in a metametalanguage, and so on. However, we shall use the English language as our (unfor- 
malized) metalanguage, although, for a substantial part of this book, we use only a mathematically 
weak portion of the English language. The contrast between object language and metalanguage 
is also present in the study of a foreign language; for example, in a Sanskrit class, Sanskrit is the 
object language, while the metalanguage, the language we use, is English. The distinction between 
proof and metaproof (i.e., a proof in the metalanguage) leads to a distinction between theorems of 
the object language and metatheorems of the metalanguage. To avoid confusion, we generally use 
“proposition” instead of “metatheorem.” The word “metamathematics” refers to the study of logi- 
cal and mathematical object languages; sometimes the word is restricted to those investigations 
that use what appear to the metamathematician to be constructive (or so-called finitary) methods. 
2. ℬ ⇒ ((ℬ ⇒ ℬ) ⇒ ℬ)    Axiom schema (A1)
3. (ℬ ⇒ (ℬ ⇒ ℬ)) ⇒ (ℬ ⇒ ℬ)    From 1 and 2 by MP
4. ℬ ⇒ (ℬ ⇒ ℬ)    Axiom schema (A1)
5. ℬ ⇒ ℬ    From 3 and 4 by MP*
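
Because the axioms of L and modus ponens are effectively checkable, a proof in L can be verified mechanically. The sketch below is an illustrative checker, not part of the text: wfs are encoded as nested tuples, with ("->", X, Y) for (X ⇒ Y) and ("~", X) for (¬X), statement letters are plain strings, and all function names are hypothetical. It is applied to the five-step proof of ℬ ⇒ ℬ, taking ℬ to be the statement letter A_1.

```python
def imp(a, b):
    return ("->", a, b)

def neg(a):
    return ("~", a)

def is_imp(f):
    return isinstance(f, tuple) and f[0] == "->"

def is_neg(f):
    return isinstance(f, tuple) and f[0] == "~"

def is_a1(f):
    # (A1): B -> (C -> B)
    return is_imp(f) and is_imp(f[2]) and f[2][2] == f[1]

def is_a2(f):
    # (A2): (B -> (C -> D)) -> ((B -> C) -> (B -> D))
    if not (is_imp(f) and is_imp(f[1]) and is_imp(f[1][2])):
        return False
    b, c, d = f[1][1], f[1][2][1], f[1][2][2]
    return f[2] == imp(imp(b, c), imp(b, d))

def is_a3(f):
    # (A3): (~C -> ~B) -> ((~C -> B) -> C)
    if not (is_imp(f) and is_imp(f[1]) and is_neg(f[1][1]) and is_neg(f[1][2])):
        return False
    c, b = f[1][1][1], f[1][2][1]
    return f[2] == imp(imp(neg(c), b), c)

def is_proof(steps):
    """Each step must be an axiom or follow from two earlier steps by MP."""
    for i, f in enumerate(steps):
        by_mp = any(steps[k] == imp(steps[j], f)
                    for j in range(i) for k in range(i))
        if not (is_a1(f) or is_a2(f) or is_a3(f) or by_mp):
            return False
    return True

B = "A1"
BB = imp(B, B)
lemma_1_8 = [
    imp(imp(B, imp(BB, B)), imp(imp(B, BB), BB)),  # step 1, instance of (A2)
    imp(B, imp(BB, B)),                            # step 2, (A1)
    imp(imp(B, BB), BB),                           # step 3, MP from 1 and 2
    imp(B, BB),                                    # step 4, (A1)
    BB,                                            # step 5, MP from 3 and 4
]
print(is_proof(lemma_1_8))
```

The one-line sequence consisting of ℬ ⇒ ℬ alone is rejected, since that wf is not an instance of any of the three schemas.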
Exercise 
1.47 Prove:


a. ⊢_L (¬ℬ ⇒ ℬ) ⇒ ℬ

b. ℬ ⇒ 𝒞, 𝒞 ⇒ 𝒟 ⊢_L ℬ ⇒ 𝒟

c. ℬ ⇒ (𝒞 ⇒ 𝒟) ⊢_L 𝒞 ⇒ (ℬ ⇒ 𝒟)

d. ⊢_L (¬𝒞 ⇒ ¬ℬ) ⇒ (ℬ ⇒ 𝒞)


In mathematical arguments, one often proves a statement 𝒞 on the
assumption of some other statement ℬ and then concludes that "if ℬ, then 𝒞" is
true. This procedure is justified for the system L by the following theorem.


Proposition 1.9 (Deduction Theorem)* 


If Γ is a set of wfs and ℬ and 𝒞 are wfs, and Γ, ℬ ⊢ 𝒞, then Γ ⊢ ℬ ⇒ 𝒞. In
particular, if ℬ ⊢ 𝒞, then ⊢ ℬ ⇒ 𝒞 (Herbrand, 1930).


Proof 


Let 𝒞_1, ..., 𝒞_n be a proof of 𝒞 from Γ ∪ {ℬ}, where 𝒞_n is 𝒞. Let us prove, by
induction on j, that Γ ⊢ ℬ ⇒ 𝒞_j for 1 ≤ j ≤ n. First of all, 𝒞_1 must be either
in Γ or an axiom of L or ℬ itself. By axiom schema (A1), 𝒞_1 ⇒ (ℬ ⇒ 𝒞_1) is an
axiom. Hence, in the first two cases, by MP, Γ ⊢ ℬ ⇒ 𝒞_1. For the third case,
when 𝒞_1 is ℬ, we have ⊢ ℬ ⇒ 𝒞_1 by Lemma 1.8, and, therefore, Γ ⊢ ℬ ⇒ 𝒞_1. This
takes care of the case j = 1. Assume now that Γ ⊢ ℬ ⇒ 𝒞_k for all k < j. Either
𝒞_j is an axiom, or 𝒞_j is in Γ, or 𝒞_j is ℬ, or 𝒞_j follows by modus ponens from
some 𝒞_ℓ and 𝒞_m, where ℓ < j, m < j, and 𝒞_m has the form 𝒞_ℓ ⇒ 𝒞_j. In the first
three cases, Γ ⊢ ℬ ⇒ 𝒞_j as in the case j = 1 above. In the last case, we have,
by inductive hypothesis, Γ ⊢ ℬ ⇒ 𝒞_ℓ and Γ ⊢ ℬ ⇒ (𝒞_ℓ ⇒ 𝒞_j). But, by axiom
schema (A2), ⊢ (ℬ ⇒ (𝒞_ℓ ⇒ 𝒞_j)) ⇒ ((ℬ ⇒ 𝒞_ℓ) ⇒ (ℬ ⇒ 𝒞_j)). Hence, by MP,
Γ ⊢ (ℬ ⇒ 𝒞_ℓ) ⇒ (ℬ ⇒ 𝒞_j), and, again by MP, Γ ⊢ ℬ ⇒ 𝒞_j. Thus, the proof by
induction is complete. The case j = n is the desired result. [Notice that,
given a deduction of 𝒞 from Γ and


* The reader should not be discouraged by the apparently unmotivated step 1 of the proof. As
in most proofs, we actually begin with the desired result, ℬ ⇒ ℬ, and then look for an
appropriate axiom that may lead by MP to that result. A mixture of ingenuity and experimentation
leads to a suitable instance of axiom (A2).

* For the remainder of the chapter, unless something is said to the contrary, we shall omit the
subscript L in ⊢_L. In addition, we shall use Γ, ℬ ⊢ 𝒞 to stand for Γ ∪ {ℬ} ⊢ 𝒞. In general, we let
Γ, ℬ_1, ..., ℬ_n ⊢ 𝒞 stand for Γ ∪ {ℬ_1, ..., ℬ_n} ⊢ 𝒞.
ℬ, the proof just given enables us to construct a deduction of ℬ ⇒ 𝒞 from Γ. Also
note that axiom schema (A3) was not used in proving the deduction theorem.]
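
The remark that the proof "enables us to construct a deduction" can be made concrete. The sketch below follows the cases of the induction step by step; it is an illustration under assumptions of its own: wfs are nested tuples as in ("->", X, Y) and ("~", X), each step of the given deduction carries a reason tag ("axiom", "gamma", "B" for the hypothesis being discharged, or ("mp", l, m) meaning that step m is step l ⇒ this step), and the names `discharge` and `is_deduction` are hypothetical.

```python
def imp(a, b):
    return ("->", a, b)

def neg(a):
    return ("~", a)

def is_imp(f):
    return isinstance(f, tuple) and f[0] == "->"

def is_neg(f):
    return isinstance(f, tuple) and f[0] == "~"

def is_axiom(f):
    """Instance of (A1), (A2), or (A3)."""
    if not is_imp(f):
        return False
    if is_imp(f[2]) and f[2][2] == f[1]:                       # (A1)
        return True
    if is_imp(f[1]) and is_imp(f[1][2]):                       # (A2)
        b, c, d = f[1][1], f[1][2][1], f[1][2][2]
        if f[2] == imp(imp(b, c), imp(b, d)):
            return True
    if is_imp(f[1]) and is_neg(f[1][1]) and is_neg(f[1][2]):   # (A3)
        c, b = f[1][1][1], f[1][2][1]
        return f[2] == imp(imp(neg(c), b), c)
    return False

def is_deduction(formulas, gamma):
    """Check that `formulas` is a deduction from the hypotheses in gamma."""
    for i, f in enumerate(formulas):
        by_mp = any(formulas[k] == imp(formulas[j], f)
                    for j in range(i) for k in range(i))
        if not (f in gamma or is_axiom(f) or by_mp):
            return False
    return True

def discharge(steps, b):
    """Turn a tagged deduction from gamma ∪ {b} into a deduction from gamma
    whose last wf is b -> (last wf of steps)."""
    out = []
    for formula, reason in steps:
        target = imp(b, formula)
        if reason == "B":
            # formula is b itself: insert the proof of b -> b (Lemma 1.8)
            bb = imp(b, b)
            out += [imp(imp(b, imp(bb, b)), imp(imp(b, bb), bb)),
                    imp(b, imp(bb, b)), imp(imp(b, bb), bb), imp(b, bb), bb]
        elif reason in ("axiom", "gamma"):
            # formula, then the (A1) instance formula -> (b -> formula), then MP
            out += [formula, imp(formula, target), target]
        else:
            _, l, _m = reason
            c_l = steps[l][0]
            rest = imp(imp(b, c_l), target)
            # (A2) instance, then MP twice using the earlier translated steps
            out += [imp(imp(b, imp(c_l, formula)), rest), rest, target]
    return out

# The deduction used for Corollary 1.10(a): B -> C, C -> D, B ⊢ D.
B, C, D = "B", "C", "D"
gamma = [imp(B, C), imp(C, D)]
steps = [(imp(B, C), "gamma"), (imp(C, D), "gamma"), (B, "B"),
         (C, ("mp", 2, 0)), (D, ("mp", 3, 1))]
result = discharge(steps, B)
print(len(result), is_deduction(result, gamma))
```

The output deduction is longer than the input by a constant factor, which reflects the three-or-five-line expansion of each original step in the proof.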


Corollary 1.10 


a. ℬ ⇒ 𝒞, 𝒞 ⇒ 𝒟 ⊢ ℬ ⇒ 𝒟
b. ℬ ⇒ (𝒞 ⇒ 𝒟), 𝒞 ⊢ ℬ ⇒ 𝒟


Proof 
For part (a):
1. ℬ ⇒ 𝒞    Hyp (abbreviation for "hypothesis")
2. 𝒞 ⇒ 𝒟    Hyp
3. ℬ        Hyp
4. 𝒞        1, 3, MP
5. 𝒟        2, 4, MP


Thus, ℬ ⇒ 𝒞, 𝒞 ⇒ 𝒟, ℬ ⊢ 𝒟. So, by the deduction theorem, ℬ ⇒ 𝒞, 𝒞 ⇒ 𝒟 ⊢ ℬ ⇒ 𝒟.
To prove (b), use the deduction theorem.


Lemma 1.11 


For any wfs ℬ and 𝒞, the following wfs are theorems of L.


a. ¬¬ℬ ⇒ ℬ
b. ℬ ⇒ ¬¬ℬ
c. ¬ℬ ⇒ (ℬ ⇒ 𝒞)
d. (¬𝒞 ⇒ ¬ℬ) ⇒ (ℬ ⇒ 𝒞)
e. (ℬ ⇒ 𝒞) ⇒ (¬𝒞 ⇒ ¬ℬ)
f. ℬ ⇒ (¬𝒞 ⇒ ¬(ℬ ⇒ 𝒞))
g. (ℬ ⇒ 𝒞) ⇒ ((¬ℬ ⇒ 𝒞) ⇒ 𝒞)


Proof 


a. ⊢ ¬¬ℬ ⇒ ℬ
1. (¬ℬ ⇒ ¬¬ℬ) ⇒ ((¬ℬ ⇒ ¬ℬ) ⇒ ℬ)    Axiom (A3)
2. ¬ℬ ⇒ ¬ℬ    Lemma 1.8*
3. (¬ℬ ⇒ ¬¬ℬ) ⇒ ℬ    1, 2, Corollary 1.10(b)
4. ¬¬ℬ ⇒ (¬ℬ ⇒ ¬¬ℬ)    Axiom (A1)
5. ¬¬ℬ ⇒ ℬ    3, 4, Corollary 1.10(a)


* Instead of writing a complete proof of ¬ℬ ⇒ ¬ℬ, we simply cite Lemma 1.8. In this way, we
indicate how the proof of ¬¬ℬ ⇒ ℬ could be written if we wished to take the time and space to do so.
This is, of course, nothing more than the ordinary application of previously proved theorems.


b. ⊢ ℬ ⇒ ¬¬ℬ
1. (¬¬¬ℬ ⇒ ¬ℬ) ⇒ ((¬¬¬ℬ ⇒ ℬ) ⇒ ¬¬ℬ)    Axiom (A3)
2. ¬¬¬ℬ ⇒ ¬ℬ    Part (a)
3. (¬¬¬ℬ ⇒ ℬ) ⇒ ¬¬ℬ    1, 2, MP
4. ℬ ⇒ (¬¬¬ℬ ⇒ ℬ)    Axiom (A1)
5. ℬ ⇒ ¬¬ℬ    3, 4, Corollary 1.10(a)


c. ⊢ ¬ℬ ⇒ (ℬ ⇒ 𝒞)
1. ¬ℬ    Hyp
2. ℬ    Hyp
3. ℬ ⇒ (¬𝒞 ⇒ ℬ)    Axiom (A1)
4. ¬ℬ ⇒ (¬𝒞 ⇒ ¬ℬ)    Axiom (A1)
5. ¬𝒞 ⇒ ℬ    2, 3, MP
6. ¬𝒞 ⇒ ¬ℬ    1, 4, MP
7. (¬𝒞 ⇒ ¬ℬ) ⇒ ((¬𝒞 ⇒ ℬ) ⇒ 𝒞)    Axiom (A3)
8. (¬𝒞 ⇒ ℬ) ⇒ 𝒞    6, 7, MP
9. 𝒞    5, 8, MP
10. ¬ℬ, ℬ ⊢ 𝒞    1–9
11. ¬ℬ ⊢ ℬ ⇒ 𝒞    10, deduction theorem
12. ⊢ ¬ℬ ⇒ (ℬ ⇒ 𝒞)    11, deduction theorem


d. ⊢ (¬𝒞 ⇒ ¬ℬ) ⇒ (ℬ ⇒ 𝒞)
1. ¬𝒞 ⇒ ¬ℬ    Hyp
2. (¬𝒞 ⇒ ¬ℬ) ⇒ ((¬𝒞 ⇒ ℬ) ⇒ 𝒞)    Axiom (A3)
3. ℬ ⇒ (¬𝒞 ⇒ ℬ)    Axiom (A1)
4. (¬𝒞 ⇒ ℬ) ⇒ 𝒞    1, 2, MP
5. ℬ ⇒ 𝒞    3, 4, Corollary 1.10(a)
6. ¬𝒞 ⇒ ¬ℬ ⊢ ℬ ⇒ 𝒞    1–5
7. ⊢ (¬𝒞 ⇒ ¬ℬ) ⇒ (ℬ ⇒ 𝒞)    6, deduction theorem


e. ⊢ (ℬ ⇒ 𝒞) ⇒ (¬𝒞 ⇒ ¬ℬ)
1. ℬ ⇒ 𝒞    Hyp
2. ¬¬ℬ ⇒ ℬ    Part (a)
3. ¬¬ℬ ⇒ 𝒞    1, 2, Corollary 1.10(a)
4. 𝒞 ⇒ ¬¬𝒞    Part (b)
5. ¬¬ℬ ⇒ ¬¬𝒞    3, 4, Corollary 1.10(a)
6. (¬¬ℬ ⇒ ¬¬𝒞) ⇒ (¬𝒞 ⇒ ¬ℬ)    Part (d)
7. ¬𝒞 ⇒ ¬ℬ    5, 6, MP
8. ℬ ⇒ 𝒞 ⊢ ¬𝒞 ⇒ ¬ℬ    1–7
9. ⊢ (ℬ ⇒ 𝒞) ⇒ (¬𝒞 ⇒ ¬ℬ)    8, deduction theorem


f. ⊢ ℬ ⇒ (¬𝒞 ⇒ ¬(ℬ ⇒ 𝒞))
Clearly, ℬ, ℬ ⇒ 𝒞 ⊢ 𝒞 by MP. Hence, ⊢ ℬ ⇒ ((ℬ ⇒ 𝒞) ⇒ 𝒞) by two uses of
the deduction theorem. Now, by (e), ⊢ ((ℬ ⇒ 𝒞) ⇒ 𝒞) ⇒ (¬𝒞 ⇒ ¬(ℬ ⇒ 𝒞)).
Hence, by Corollary 1.10(a), ⊢ ℬ ⇒ (¬𝒞 ⇒ ¬(ℬ ⇒ 𝒞)).


g. ⊢ (ℬ ⇒ 𝒞) ⇒ ((¬ℬ ⇒ 𝒞) ⇒ 𝒞)
1. ℬ ⇒ 𝒞    Hyp
2. ¬ℬ ⇒ 𝒞    Hyp
3. (ℬ ⇒ 𝒞) ⇒ (¬𝒞 ⇒ ¬ℬ)    Part (e)
4. ¬𝒞 ⇒ ¬ℬ    1, 3, MP
5. (¬ℬ ⇒ 𝒞) ⇒ (¬𝒞 ⇒ ¬¬ℬ)    Part (e)
6. ¬𝒞 ⇒ ¬¬ℬ    2, 5, MP
7. (¬𝒞 ⇒ ¬¬ℬ) ⇒ ((¬𝒞 ⇒ ¬ℬ) ⇒ 𝒞)    Axiom (A3)
8. (¬𝒞 ⇒ ¬ℬ) ⇒ 𝒞    6, 7, MP
9. 𝒞    4, 8, MP
10. ℬ ⇒ 𝒞, ¬ℬ ⇒ 𝒞 ⊢ 𝒞    1–9
11. ℬ ⇒ 𝒞 ⊢ (¬ℬ ⇒ 𝒞) ⇒ 𝒞    10, deduction theorem
12. ⊢ (ℬ ⇒ 𝒞) ⇒ ((¬ℬ ⇒ 𝒞) ⇒ 𝒞)    11, deduction theorem


Exercises


1.48 Show that the following wfs are theorems of L.


a. ℬ ⇒ (ℬ ∨ 𝒞)
b. ℬ ⇒ (𝒞 ∨ ℬ)
c. 𝒞 ∨ ℬ ⇒ ℬ ∨ 𝒞
d. ℬ ∧ 𝒞 ⇒ ℬ
e. ℬ ∧ 𝒞 ⇒ 𝒞
f. (ℬ ⇒ 𝒟) ⇒ ((𝒞 ⇒ 𝒟) ⇒ (ℬ ∨ 𝒞 ⇒ 𝒟))
g. ((ℬ ⇒ 𝒞) ⇒ ℬ) ⇒ ℬ
h. ℬ ⇒ (𝒞 ⇒ (ℬ ∧ 𝒞))


1.49 Exhibit a complete proof in L of Lemma 1.11(c). [Hint: Apply the
procedure used in the proof of the deduction theorem to the demonstration
given earlier of Lemma 1.11(c).] Greater fondness for the deduction
theorem will result if the reader tries to prove all of Lemma 1.11 without
using the deduction theorem.


It is our purpose to show that a wf of L is a theorem of L if and only if it is 
a tautology. Half of this is very easy. 


Proposition 1.12 


Every theorem of L is a tautology. 


Proof 


As an exercise, verify that all the axioms of L are tautologies. By Proposition 
1.2, modus ponens leads from tautologies to other tautologies. Hence, every 
theorem of L is a tautology. 
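
The exercise suggested in this proof can be carried out by brute force. Since the truth value of any instance of a schema depends only on the truth values of ℬ, 𝒞, and 𝒟, it is enough to treat these as boolean variables. The following sketch does that; the function names are illustrative only.

```python
from itertools import product

def implies(p, q):
    return (not p) or q

def a1(b, c):
    # (A1): B ⇒ (C ⇒ B)
    return implies(b, implies(c, b))

def a2(b, c, d):
    # (A2): (B ⇒ (C ⇒ D)) ⇒ ((B ⇒ C) ⇒ (B ⇒ D))
    return implies(implies(b, implies(c, d)),
                   implies(implies(b, c), implies(b, d)))

def a3(b, c):
    # (A3): (¬C ⇒ ¬B) ⇒ ((¬C ⇒ B) ⇒ C)
    return implies(implies(not c, not b), implies(implies(not c, b), c))

def always_true(schema, n):
    return all(schema(*values) for values in product([True, False], repeat=n))

# Modus ponens never leads from true premisses to a false conclusion.
mp_preserves_truth = all(c for b, c in product([True, False], repeat=2)
                         if b and implies(b, c))

print(always_true(a1, 2), always_true(a2, 3), always_true(a3, 2), mp_preserves_truth)
```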

The following lemma is to be used in the proof that every tautology is a 
theorem of L. 


Lemma 1.13 


Let ℬ be a wf and let B_1, ..., B_k be the statement letters that occur in ℬ. For
a given assignment of truth values to B_1, ..., B_k, let B_j′ be B_j if B_j takes the
value T; and let B_j′ be ¬B_j if B_j takes the value F. Let ℬ′ be ℬ if ℬ takes the
value T under the assignment, and let ℬ′ be ¬ℬ if ℬ takes the value F. Then
B_1′, ..., B_k′ ⊢ ℬ′.

For example, let ℬ be ¬(¬A_2 ⇒ A_5). Then for each row of the truth table


A_2  A_5  ¬(¬A_2 ⇒ A_5)
T    T    F
F    T    F
T    F    F
F    F    T


Lemma 1.13 asserts a corresponding deducibility relation. For instance,
corresponding to the third row there is A_2, ¬A_5 ⊢ ¬¬(¬A_2 ⇒ A_5), and to the fourth
row, ¬A_2, ¬A_5 ⊢ ¬(¬A_2 ⇒ A_5).
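
The deducibility relations that Lemma 1.13 attaches to the rows of a truth table can be listed mechanically. The sketch below only generates the statements (it does not prove them); the function name `deducibility_rows` and the string encoding of wfs are assumptions made for illustration.

```python
from itertools import product

def implies(p, q):
    return (not p) or q

def deducibility_rows(letters, form, text):
    """For each row of the truth table, build the statement B_1', ..., B_k' ⊢ ℬ'.
    `form` evaluates the wf on booleans, and `text` is its written form."""
    rows = []
    for reversed_values in product([True, False], repeat=len(letters)):
        # Reverse so that the first letter alternates fastest, as in the tables of the text.
        values = reversed_values[::-1]
        hypotheses = [name if value else "¬" + name
                      for name, value in zip(letters, values)]
        conclusion = text if form(*values) else "¬" + text
        rows.append(", ".join(hypotheses) + " ⊢ " + conclusion)
    return rows

rows = deducibility_rows(
    ["A2", "A5"],
    lambda a2, a5: not implies(not a2, a5),
    "¬(¬A2 ⇒ A5)",
)
for line in rows:
    print(line)
```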


Proof 


The proof is by induction on the number n of occurrences of ¬ and ⇒ in ℬ.
(We assume ℬ written without abbreviations.) If n = 0, ℬ is just a statement
letter B_1, and then the lemma reduces to B_1 ⊢ B_1 and ¬B_1 ⊢ ¬B_1. Assume now
that the lemma holds for all j < n.


Case 1: ℬ is ¬𝒞. Then 𝒞 has fewer than n occurrences of ¬ and ⇒.


Subcase 1a: Let 𝒞 take the value T under the given truth value assignment.
Then ℬ takes the value F. So, 𝒞′ is 𝒞 and ℬ′ is ¬ℬ. By the inductive
hypothesis applied to 𝒞, we have B_1′, ..., B_k′ ⊢ 𝒞. Then, by Lemma 1.11(b) and MP,
B_1′, ..., B_k′ ⊢ ¬¬𝒞. But ¬¬𝒞 is ℬ′.

Subcase 1b: Let 𝒞 take the value F. Then ℬ takes the value T. So, 𝒞′ is ¬𝒞 and
ℬ′ is ℬ. By inductive hypothesis, B_1′, ..., B_k′ ⊢ ¬𝒞. But ¬𝒞 is ℬ′.


Case 2: ℬ is 𝒞 ⇒ 𝒟. Then 𝒞 and 𝒟 have fewer occurrences of ¬ and ⇒ than ℬ.
So, by inductive hypothesis, B_1′, ..., B_k′ ⊢ 𝒞′ and B_1′, ..., B_k′ ⊢ 𝒟′.

Subcase 2a: 𝒞 takes the value F. Then ℬ takes the value T. So, 𝒞′ is ¬𝒞 and ℬ′
is ℬ. Hence, B_1′, ..., B_k′ ⊢ ¬𝒞. By Lemma 1.11(c) and MP, B_1′, ..., B_k′ ⊢ 𝒞 ⇒ 𝒟.
But 𝒞 ⇒ 𝒟 is ℬ′.

Subcase 2b: 𝒟 takes the value T. Then ℬ takes the value T. So, 𝒟′ is 𝒟 and ℬ′
is ℬ. Hence, B_1′, ..., B_k′ ⊢ 𝒟. Then, by axiom (A1) and MP, B_1′, ..., B_k′ ⊢ 𝒞 ⇒ 𝒟.
But 𝒞 ⇒ 𝒟 is ℬ′.

Subcase 2c: 𝒞 takes the value T and 𝒟 takes the value F. Then ℬ takes the value F.
So, 𝒞′ is 𝒞, 𝒟′ is ¬𝒟, and ℬ′ is ¬ℬ. Therefore, B_1′, ..., B_k′ ⊢ 𝒞 and
B_1′, ..., B_k′ ⊢ ¬𝒟. Hence, by Lemma 1.11(f) and MP, B_1′, ..., B_k′ ⊢ ¬(𝒞 ⇒ 𝒟).
But ¬(𝒞 ⇒ 𝒟) is ℬ′.


Proposition 1.14 (Completeness Theorem) 
If a wf ℬ of L is a tautology, then it is a theorem of L.


Proof


(Kalmár, 1935) Assume ℬ is a tautology, and let B_1, ..., B_k be the statement
letters in ℬ. For any truth value assignment to B_1, ..., B_k, we have, by Lemma
1.13, B_1′, ..., B_k′ ⊢ ℬ. (ℬ′ is ℬ because ℬ always takes the value T.) Hence, when
B_k is given the value T, we obtain B_1′, ..., B_{k−1}′, B_k ⊢ ℬ, and, when B_k is given
the value F, we obtain B_1′, ..., B_{k−1}′, ¬B_k ⊢ ℬ. So, by the deduction theorem,
B_1′, ..., B_{k−1}′ ⊢ B_k ⇒ ℬ and B_1′, ..., B_{k−1}′ ⊢ ¬B_k ⇒ ℬ. Then by Lemma 1.11(g) and
MP, B_1′, ..., B_{k−1}′ ⊢ ℬ. Similarly, B_{k−1} may be chosen to be T or F and, again
applying the deduction theorem, Lemma 1.11(g) and MP, we can eliminate
B_{k−1}′ just as we eliminated B_k′. After k such steps, we finally obtain ⊢ ℬ.


Corollary 1.15 


If 𝒞 is an expression involving the signs ¬, ⇒, ∧, ∨, and ⇔ that is an
abbreviation for a wf ℬ of L, then 𝒞 is a tautology if and only if ℬ is a theorem of L.

Proof


In definitions (D1)–(D3), the abbreviating formulas replace wfs to which
they are logically equivalent. Hence, by Proposition 1.4, ℬ and 𝒞 are logically
equivalent, and 𝒞 is a tautology if and only if ℬ is a tautology. The corollary
now follows from Propositions 1.12 and 1.14.


Corollary 1.16 


The system L is consistent; that is, there is no wf ℬ such that both ℬ and ¬ℬ
are theorems of L.


Proof 


By Proposition 1.12, every theorem of L is a tautology. The negation of a
tautology cannot be a tautology and, therefore, it is impossible for both ℬ
and ¬ℬ to be theorems of L.

Notice that L is consistent if and only if not all wfs of L are theorems. In fact,
if L is consistent, then there are wfs that are not theorems (e.g., the negations of
theorems). On the other hand, by Lemma 1.11(c), ⊢_L ¬ℬ ⇒ (ℬ ⇒ 𝒞), and so, if L
were inconsistent, that is, if some wf ℬ and its negation ¬ℬ were provable, then
by MP any wf 𝒞 would be provable. (This equivalence holds for any theory that
has modus ponens as a rule of inference and in which Lemma 1.11(c) is provable.)
A theory in which not all wfs are theorems is said to be absolutely consistent, and
this definition is applicable even to theories that do not contain a negation sign.


Exercise 


1.50 Let ℬ be a statement form that is not a tautology. Let L* be the formal
theory obtained from L by adding as new axioms all wfs obtainable
from ℬ by substituting arbitrary statement forms for the statement let-
ters in ℬ, with the same form being substituted for all occurrences of a
statement letter. Show that L* is inconsistent.


1.5 Independence: Many-Valued Logics 


A subset Y of the set of axioms of a theory is said to be independent if some 
wf in Y cannot be proved by means of the rules of inference from the set of 
those axioms not in Y. 


Proposition 1.17 


Each of the axiom schemas (A1)-(A3) is independent. 
Proof


To prove the independence of axiom schema (A1), consider the following 
tables: 


A   ¬A        A   B   A ⇒ B
0   1         0   0   0
1   1         1   0   2
2   0         2   0   0
              0   1   2
              1   1   2
              2   1   0
              0   2   2
              1   2   0
              2   2   0


For any assignment of the values 0, 1, and 2 to the statement letters of a wf
ℬ, these tables determine a corresponding value of ℬ. If ℬ always takes the
value 0, ℬ is called select. Modus ponens preserves selectness, since it is easy
to check that, if ℬ and ℬ ⇒ 𝒞 are select, so is 𝒞. One can also verify that all
instances of axiom schemas (A2) and (A3) are select. Hence, any wf deriv-
able from (A2) and (A3) by modus ponens is select. However, A₁ ⇒ (A₂ ⇒ A₁),
which is an instance of (A1), is not select, since it takes the value 2 when A₁
is 1 and A₂ is 2.
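
The checks described above are finite, so they can be carried out mechanically. The following is a minimal sketch, not part of the original proof: the names NEG, IMP, a1, a2, a3, always_zero, and mp_preserves_zero are illustrative choices. It relies on the fact that the value of an instance of a schema depends only on the values taken by the wfs substituted for ℬ, 𝒞, and 𝒟, so it is enough to run through all values of those metavariables.

```python
from itertools import product

# Three-valued tables used for the independence of (A1).
NEG = {0: 1, 1: 1, 2: 0}
IMP = {  # IMP[(a, b)] is the value of A => B when A has value a and B has value b
    (0, 0): 0, (1, 0): 2, (2, 0): 0,
    (0, 1): 2, (1, 1): 2, (2, 1): 0,
    (0, 2): 2, (1, 2): 0, (2, 2): 0,
}
VALUES = (0, 1, 2)


def neg(a):
    return NEG[a]


def imp(a, b):
    return IMP[(a, b)]


def a1(b, c):
    # (A1): B => (C => B)
    return imp(b, imp(c, b))


def a2(b, c, d):
    # (A2): (B => (C => D)) => ((B => C) => (B => D))
    return imp(imp(b, imp(c, d)), imp(imp(b, c), imp(b, d)))


def a3(b, c):
    # (A3): (not C => not B) => ((not C => B) => C)
    return imp(imp(neg(c), neg(b)), imp(imp(neg(c), b), c))


def always_zero(schema, arity):
    """True if the schema takes the value 0 for every choice of values."""
    return all(schema(*vals) == 0 for vals in product(VALUES, repeat=arity))


def mp_preserves_zero():
    """If B and B => C both have value 0, then C must have value 0."""
    return all(c == 0 for b, c in product(VALUES, repeat=2)
               if b == 0 and imp(b, c) == 0)


print(always_zero(a2, 3), always_zero(a3, 2), always_zero(a1, 2))
print(a1(1, 2))  # value of A1 => (A2 => A1) when A1 is 1 and A2 is 2
```

Replacing the two dictionaries by the second pair of tables gives the corresponding check for the independence of (A2).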

To prove the independence of axiom schema (A2), consider the following 
tables: 


A   ¬A        A   B   A ⇒ B
0   1         0   0   0
1   0         1   0   0
2   1         2   0   0
              0   1   2
              1   1   2
              2   1   0
              0   2   1
              1   2   0
              2   2   0


Let us call a wf that always takes the value 0 according to these tables 
grotesque. Modus ponens preserves grotesqueness and it is easy to verify 
that all instances of (A1) and (A3) are grotesque. However, the instance
(A₁ ⇒ (A₂ ⇒ A₃)) ⇒ ((A₁ ⇒ A₂) ⇒ (A₁ ⇒ A₃)) of (A2) takes the value 2 when A₁
is 0, A₂ is 0, and A₃ is 1 and, therefore, is not grotesque.

The following argument proves the independence of (A3). Let us call a wf
ℬ super if the wf h(ℬ) obtained by erasing all negation signs in ℬ is a tautol-
ogy. Each instance of axiom schemas (A1) and (A2) is super. Also, modus
ponens preserves the property of being super; for if h(ℬ ⇒ 𝒞) and h(ℬ) are
tautologies, then h(𝒞) is a tautology. (Just note that h(ℬ ⇒ 𝒞) is h(ℬ) ⇒ h(𝒞)
and use Proposition 1.2.) Hence, every wf ℬ derivable from (A1) and (A2) by
modus ponens is super. But h((¬A₁ ⇒ ¬A₁) ⇒ ((¬A₁ ⇒ A₁) ⇒ A₁)) is (A₁ ⇒ A₁) ⇒
((A₁ ⇒ A₁) ⇒ A₁), which is not a tautology. Therefore, (¬A₁ ⇒ ¬A₁) ⇒ ((¬A₁ ⇒
A₁) ⇒ A₁), an instance of (A3), is not super and is thereby not derivable from
(A1) and (A2) by modus ponens.
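
The erasure map h and the notion of a super wf can be illustrated with a short sketch. The tuple encoding of wfs and the names erase_negations, is_tautology, and is_super are assumptions made only for this illustration; they do not come from the text.

```python
from itertools import product

# Wfs as nested tuples: ("var", name), ("not", wf), ("imp", wf, wf).


def var(name):
    return ("var", name)


def neg(f):
    return ("not", f)


def imp(f, g):
    return ("imp", f, g)


def erase_negations(f):
    """The map h: delete every negation sign, keep everything else."""
    if f[0] == "var":
        return f
    if f[0] == "not":
        return erase_negations(f[1])
    return ("imp", erase_negations(f[1]), erase_negations(f[2]))


def letters(f):
    """The set of statement letters occurring in f."""
    if f[0] == "var":
        return {f[1]}
    return set().union(*(letters(sub) for sub in f[1:]))


def evaluate(f, assignment):
    """Ordinary two-valued truth value of f under the assignment."""
    if f[0] == "var":
        return assignment[f[1]]
    if f[0] == "not":
        return not evaluate(f[1], assignment)
    return (not evaluate(f[1], assignment)) or evaluate(f[2], assignment)


def is_tautology(f):
    names = sorted(letters(f))
    return all(evaluate(f, dict(zip(names, vals)))
               for vals in product((True, False), repeat=len(names)))


def is_super(f):
    return is_tautology(erase_negations(f))


A1 = var("A1")
# The instance of (A3) used in the proof.
a3_instance = imp(imp(neg(A1), neg(A1)), imp(imp(neg(A1), A1), A1))
print(is_tautology(a3_instance), is_super(a3_instance))
```

The instance is a tautology in the ordinary sense but fails to be super, which is exactly the property the argument exploits.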

The idea used in the proof of the independence of axiom schemas (A1) 
and (A2) may be generalized to the notion of a many-valued logic. Select a 
positive integer n, call the numbers 0, 1, ..., n truth values, and choose a num-
ber m such that 0 ≤ m < n. The numbers 0, 1, ..., m are called designated values.
Take a finite number of “truth tables” representing functions from sets of
the form {0, 1, ..., n}ᵏ into {0, 1, ..., n}. For each truth table, introduce a sign,
called the corresponding connective. Using these connectives and state- 
ment letters, we may construct “statement forms,” and every such state- 
ment form containing j distinct letters determines a “truth function” from 
{0, 1, ..., n}ʲ into {0, 1, ..., n}. A statement form whose corresponding truth
function takes only designated values is said to be exceptional. The numbers 
m and n and the basic truth tables are said to define a (finite) many-valued 
logic M. A formal theory involving statement letters and the connectives of 
M is said to be suitable for M if and only if the theorems of the theory coin- 
cide with the exceptional statement forms of M. All these notions obviously 
can be generalized to the case of an infinite number of truth values. If n = 1 
and m = 0 and the truth tables are those given for ¬ and ⇒ in Section 1.1,
then the corresponding two-valued logic is that studied in this chapter. The 
exceptional wfs in this case were called tautologies. The system L is suit- 
able for this logic, as proved in Propositions 1.12 and 1.14. In the proofs of 
the independence of axiom schemas (A1) and (A2), two three-valued logics 
were used. 
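
As a small illustration of the definition of an exceptional statement form, the sketch below tests whether a truth function takes only designated values. The function name is_exceptional and its parameter names are illustrative, and the reading of the two-valued case with 0 in the role of T and 1 in the role of F is an assumption made here so that 0 is the single designated value.

```python
from itertools import product


def is_exceptional(truth_function, num_letters, n, m):
    """True if truth_function takes only the designated values 0..m
    on every argument tuple drawn from {0, ..., n}."""
    values = range(n + 1)
    return all(truth_function(*args) <= m
               for args in product(values, repeat=num_letters))


# Two-valued case (n = 1, m = 0), assuming 0 stands for T and 1 for F.
def neg2(a):
    return 1 - a


def imp2(a, b):
    # A => B is F (here 1) only when A is T (0) and B is F (1).
    return 1 if (a == 0 and b == 1) else 0


print(is_exceptional(lambda a: imp2(a, a), 1, n=1, m=0))              # A => A
print(is_exceptional(lambda a, b: imp2(a, imp2(b, a)), 2, n=1, m=0))  # A => (B => A)
print(is_exceptional(lambda a: a, 1, n=1, m=0))                       # A alone
```

Under this reading the exceptional forms of the two-valued logic are the tautologies, in agreement with the remark above.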


Exercises 


1.51 Prove the independence of axiom schema (A3) by constructing appro-
priate “truth tables” for ¬ and ⇒.


1.52 (McKinsey and Tarski, 1948) Consider the axiomatic theory P in which 
there is exactly one binary connective *, the only rule of inference is
modus ponens (that is, 𝒞 follows from ℬ and ℬ * 𝒞), and the axioms
are all wfs of the form ℬ * ℬ. Show that P is not suitable for any (finite)
many-valued logic. 




1.53 For any (finite) many-valued logic M, prove that there is an axiomatic 
theory suitable for M. 


Further information about many-valued logics can be found in Rosser 
and Turquette (1952), Rescher (1969), Bolc and Borowik (1992), and 
Malinowski (1993). 


1.6 Other Axiomatizations 


Although the axiom system L is quite simple, there are many other systems 
that would do as well. We can use, instead of ¬ and ⇒, any collection of
primitive connectives as long as these are adequate for the definition of all 
other truth-functional connectives. 


Examples 


L₁: ∨ and ¬ are the primitive connectives. We use ℬ ⇒ 𝒞 as an abbreviation for
¬ℬ ∨ 𝒞. We have four axiom schemas: (1) ℬ ∨ ℬ ⇒ ℬ; (2) ℬ ⇒ ℬ ∨ 𝒞; (3) ℬ ∨ 𝒞 ⇒
𝒞 ∨ ℬ; and (4) (𝒞 ⇒ 𝒟) ⇒ (ℬ ∨ 𝒞 ⇒ ℬ ∨ 𝒟). The only rule of inference is modus
ponens. Here and below we use the usual rules for eliminating parentheses.
This system is developed in Hilbert and Ackermann (1950).


L₂: ∧ and ¬ are the primitive connectives. ℬ ⇒ 𝒞 is an abbreviation for
¬(ℬ ∧ ¬𝒞). There are three axiom schemas: (1) ℬ ⇒ (ℬ ∧ ℬ); (2) ℬ ∧ 𝒞 ⇒ ℬ;
and (3) (ℬ ⇒ 𝒞) ⇒ (¬(𝒞 ∧ 𝒟) ⇒ ¬(𝒟 ∧ ℬ)). Modus ponens is the only rule of
inference. Consult Rosser (1953) for a detailed study.


L₃: This is just like our original system L except that, instead of the axiom
schemas (A1)-(A3), we have three specific axioms: (1) A₁ ⇒ (A₂ ⇒ A₁);
(2) (A₁ ⇒ (A₂ ⇒ A₃)) ⇒ ((A₁ ⇒ A₂) ⇒ (A₁ ⇒ A₃)); and (3) (¬A₂ ⇒ ¬A₁) ⇒ ((¬A₂ ⇒ A₁) ⇒ A₂).
In addition to modus ponens, we have a substitution rule: we may substitute
any wf for all occurrences of a statement letter in a given wf.


L₄: The primitive connectives are ⇒, ∧, ∨, and ¬. Modus ponens is the only rule,
and we have 10 axiom schemas: (1) ℬ ⇒ (𝒞 ⇒ ℬ); (2) (ℬ ⇒ (𝒞 ⇒ 𝒟)) ⇒ ((ℬ ⇒ 𝒞)
⇒ (ℬ ⇒ 𝒟)); (3) ℬ ∧ 𝒞 ⇒ ℬ; (4) ℬ ∧ 𝒞 ⇒ 𝒞; (5) ℬ ⇒ (𝒞 ⇒ (ℬ ∧ 𝒞)); (6) ℬ ⇒ (ℬ ∨ 𝒞);
(7) 𝒞 ⇒ (ℬ ∨ 𝒞); (8) (ℬ ⇒ 𝒟) ⇒ ((𝒞 ⇒ 𝒟) ⇒ (ℬ ∨ 𝒞 ⇒ 𝒟)); (9) (ℬ ⇒ 𝒞) ⇒ ((ℬ ⇒ ¬𝒞)
⇒ ¬ℬ); and (10) ¬¬ℬ ⇒ ℬ. This system is discussed in Kleene (1952).
Axiomatizations can be found for the propositional calculus that contain 
only one axiom schema. For example, if ¬ and ⇒ are the primitive connec-
tives and modus ponens the only rule of inference, then the axiom schema


[(((ℬ ⇒ 𝒞) ⇒ (¬𝒟 ⇒ ¬ℰ)) ⇒ 𝒟) ⇒ ℱ] ⇒ [(ℱ ⇒ ℬ) ⇒ (ℰ ⇒ ℬ)]


is sufficient (Meredith, 1953). Another single-axiom formulation, due to
Nicod (1917), uses only alternative denial |. Its rule of inference is: 𝒟 follows
from ℬ | (𝒞 | 𝒟) and ℬ, and its axiom schema is


(ℬ | (𝒞 | 𝒟)) | {[ℰ | (ℰ | ℰ)] | [(ℱ | 𝒞) | ((ℬ | ℱ) | (ℬ | ℱ))]}


Further information, including historical background, may be found 
in Church (1956) and in a paper by Lukasiewicz and Tarski in Tarski 
(1956, IV). 


Exercises 

1.54 (Hilbert and Ackermann, 1950) Prove the following results about the
theory L₁.
a. ℬ ⇒ 𝒞 ⊢_L₁ 𝒟 ∨ ℬ ⇒ 𝒟 ∨ 𝒞
b. ⊢_L₁ (ℬ ⇒ 𝒞) ⇒ ((𝒟 ⇒ ℬ) ⇒ (𝒟 ⇒ 𝒞))
c. 𝒟 ⇒ ℬ, ℬ ⇒ 𝒞 ⊢_L₁ 𝒟 ⇒ 𝒞
d. ⊢_L₁ ℬ ⇒ ℬ (i.e., ⊢_L₁ ¬ℬ ∨ ℬ)
e. ⊢_L₁ ℬ ∨ ¬ℬ
f. ⊢_L₁ ℬ ⇒ ¬¬ℬ
g. ⊢_L₁ ¬ℬ ⇒ (ℬ ⇒ 𝒞)
h. ⊢_L₁ ℬ ∨ (𝒞 ∨ 𝒟) ⇒ ((𝒞 ∨ (ℬ ∨ 𝒟)) ∨ ℬ)
i. ⊢_L₁ (𝒞 ∨ (ℬ ∨ 𝒟)) ∨ ℬ ⇒ 𝒞 ∨ (ℬ ∨ 𝒟)
j. ⊢_L₁ ℬ ∨ (𝒞 ∨ 𝒟) ⇒ 𝒞 ∨ (ℬ ∨ 𝒟)
k. ⊢_L₁ (ℬ ⇒ (𝒞 ⇒ 𝒟)) ⇒ (𝒞 ⇒ (ℬ ⇒ 𝒟))
l. ⊢_L₁ (𝒟 ⇒ ℬ) ⇒ ((ℬ ⇒ 𝒞) ⇒ (𝒟 ⇒ 𝒞))
m. ℬ ⇒ (𝒞 ⇒ 𝒟), ℬ ⇒ 𝒞 ⊢_L₁ ℬ ⇒ (ℬ ⇒ 𝒟)
n. ℬ ⇒ (𝒞 ⇒ 𝒟), ℬ ⇒ 𝒞 ⊢_L₁ ℬ ⇒ 𝒟
o. If Γ, ℬ ⊢_L₁ 𝒞, then Γ ⊢_L₁ ℬ ⇒ 𝒞 (Deduction theorem)
p. 𝒞 ⇒ ℬ, ¬𝒞 ⇒ ℬ ⊢_L₁ ℬ
q. ⊢_L₁ ℬ if and only if ℬ is a tautology.


1.55 (Rosser, 1953) Prove the following facts about the theory L₂.
a. ℬ ⇒ 𝒞, 𝒞 ⇒ 𝒟 ⊢_L₂ ¬(¬𝒟 ∧ ℬ)
b. ⊢_L₂ ¬(¬ℬ ∧ ℬ)
c. ⊢_L₂ ¬¬ℬ ⇒ ℬ
d. ⊢_L₂ ¬(ℬ ∧ 𝒞) ⇒ (𝒞 ⇒ ¬ℬ)
e. ⊢_L₂ ℬ ⇒ ¬¬ℬ
f. ⊢_L₂ (ℬ ⇒ 𝒞) ⇒ (¬𝒞 ⇒ ¬ℬ)
g. ¬ℬ ⇒ ¬𝒞 ⊢_L₂ 𝒞 ⇒ ℬ
h. ℬ ⇒ 𝒞 ⊢_L₂ 𝒟 ∧ ℬ ⇒ 𝒞 ∧ 𝒟
i. ℬ ⇒ 𝒞, 𝒞 ⇒ 𝒟, 𝒟 ⇒ ℰ ⊢_L₂ ℬ ⇒ ℰ
j. ⊢_L₂ ℬ ⇒ ℬ
k. ⊢_L₂ ℬ ∧ 𝒞 ⇒ 𝒞 ∧ ℬ
l. ℬ ⇒ 𝒞, 𝒞 ⇒ 𝒟 ⊢_L₂ ℬ ⇒ 𝒟
m. ℬ ⇒ 𝒞, 𝒟 ⇒ ℰ ⊢_L₂ ℬ ∧ 𝒟 ⇒ 𝒞 ∧ ℰ
n. 𝒞 ⇒ 𝒟 ⊢_L₂ ℬ ∧ 𝒞 ⇒ ℬ ∧ 𝒟
o. ⊢_L₂ (ℬ ⇒ (𝒞 ⇒ 𝒟)) ⇒ ((ℬ ∧ 𝒞) ⇒ 𝒟)
p. ⊢_L₂ ((ℬ ∧ 𝒞) ⇒ 𝒟) ⇒ (ℬ ⇒ (𝒞 ⇒ 𝒟))
q. ℬ ⇒ 𝒞, ℬ ⇒ (𝒞 ⇒ 𝒟) ⊢_L₂ ℬ ⇒ 𝒟
r. ⊢_L₂ ℬ ⇒ (𝒞 ⇒ ℬ ∧ 𝒞)
s. ⊢_L₂ ℬ ⇒ (𝒞 ⇒ ℬ)
t. If Γ, ℬ ⊢_L₂ 𝒞, then Γ ⊢_L₂ ℬ ⇒ 𝒞 (Deduction theorem)
u. ⊢_L₂ (¬ℬ ⇒ ℬ) ⇒ ℬ
v. ℬ ⇒ 𝒞, ¬ℬ ⇒ 𝒞 ⊢_L₂ 𝒞
w. ⊢_L₂ ℬ if and only if ℬ is a tautology.

1.56 Show that the theory L₃ has the same theorems as the theory L.

1.57 (Kleene, 1952) Derive the following facts about the theory L₄.
a. ⊢_L₄ ℬ ⇒ ℬ
b. If Γ, ℬ ⊢_L₄ 𝒞, then Γ ⊢_L₄ ℬ ⇒ 𝒞 (deduction theorem)
c. ℬ ⇒ 𝒞, 𝒞 ⇒ 𝒟 ⊢_L₄ ℬ ⇒ 𝒟
d. ⊢_L₄ (ℬ ⇒ 𝒞) ⇒ (¬𝒞 ⇒ ¬ℬ)
e. ℬ, ¬ℬ ⊢_L₄ 𝒞
f. ⊢_L₄ ℬ ⇒ ¬¬ℬ
g. ⊢_L₄ ¬ℬ ⇒ (ℬ ⇒ 𝒞)
h. ⊢_L₄ ℬ ⇒ (¬𝒞 ⇒ ¬(ℬ ⇒ 𝒞))
i. ⊢_L₄ ¬ℬ ⇒ (¬𝒞 ⇒ ¬(ℬ ∨ 𝒞))
j. ⊢_L₄ (¬𝒞 ⇒ ℬ) ⇒ ((𝒞 ⇒ ℬ) ⇒ ℬ)
k. ⊢_L₄ ℬ if and only if ℬ is a tautology.


1.58 Consider the following axiomatization of the propositional calculus ℒ
(due to Łukasiewicz). ℒ has the same wfs as our system L. Its only rule
of inference is modus ponens. Its axiom schemas are:

a. (¬ℬ ⇒ ℬ) ⇒ ℬ
b. ℬ ⇒ (¬ℬ ⇒ 𝒞)
c. (ℬ ⇒ 𝒞) ⇒ ((𝒞 ⇒ 𝒟) ⇒ (ℬ ⇒ 𝒟))

Prove that a wf ℬ of ℒ is provable in ℒ if and only if ℬ is a tautology.
[Hint: Show that L and ℒ have the same theorems. However, remember
that none of the results proved about L (such as Propositions 1.8-1.13)
automatically carries over to ℒ. In particular, the deduction theorem is
not available until it is proved for ℒ.]


1.59 Show that axiom schema (A3) of L can be replaced by the schema
(¬ℬ ⇒ ¬𝒞) ⇒ (𝒞 ⇒ ℬ) without altering the class of theorems.


1.60 If axiom schema (10) of L₄ is replaced by the schema (10)′: ¬ℬ ⇒ (ℬ ⇒ 𝒞),
then the new system L_I is called the intuitionistic propositional calcu-
lus.* Prove the following results about L_I.

a. Consider an (n + 1)-valued logic with these connectives: ¬ℬ is 0
when ℬ is n, and otherwise it is n; ℬ ∧ 𝒞 has the maximum of the
values of ℬ and 𝒞, whereas ℬ ∨ 𝒞 has the minimum of these values;
and ℬ ⇒ 𝒞 is 0 if ℬ has a value not less than that of 𝒞, and otherwise
it has the same value as 𝒞. If we take 0 as the only designated value,
all theorems of L_I are exceptional.
b. A₁ ∨ ¬A₁ and ¬¬A₁ ⇒ A₁ are not theorems of L_I.
c. For any m, the wf

(A₁ ⇔ A₂) ∨ ... ∨ (A₁ ⇔ Aₘ) ∨ (A₂ ⇔ A₃) ∨ ...
∨ (A₂ ⇔ Aₘ) ∨ ... ∨ (Aₘ₋₁ ⇔ Aₘ)

is not a theorem of L_I.
d. (Gödel, 1933) L_I is not suitable for any finite many-valued logic.
e. i. If Γ, ℬ ⊢_LI 𝒞, then Γ ⊢_LI ℬ ⇒ 𝒞 (deduction theorem)
   ii. ℬ ⇒ 𝒞, 𝒞 ⇒ 𝒟 ⊢_LI ℬ ⇒ 𝒟
   iii. ⊢_LI ℬ ⇒ ¬¬ℬ
   iv. ⊢_LI (ℬ ⇒ 𝒞) ⇒ (¬𝒞 ⇒ ¬ℬ)
   v. ⊢_LI ℬ ⇒ (¬ℬ ⇒ 𝒞)
   vi. ⊢_LI ¬¬(¬¬ℬ ⇒ ℬ)
   vii. ¬¬(ℬ ⇒ 𝒞), ¬¬ℬ ⊢_LI ¬¬𝒞
   viii. ⊢_LI ¬¬¬ℬ ⇒ ¬ℬ
f.ᴰ ⊢_LI ¬¬ℬ if and only if ℬ is a tautology.
g. ⊢_LI ¬ℬ if and only if ¬ℬ is a tautology.
h.ᴰ If ℬ has ∧ and ¬ as its only connectives, then ⊢_LI ℬ if and only if ℬ
is a tautology.


* The principal origin of intuitionistic logic was L.E.J. Brouwer’s belief that classical logic is
wrong. According to Brouwer, ℬ ∨ 𝒞 is proved only when a proof of ℬ or a proof of 𝒞 has
been found. As a consequence, various tautologies, such as ℬ ∨ ¬ℬ, are not generally accept-
able. For further information, consult Brouwer (1976), Heyting (1956), Kleene (1952), Troelstra
(1969), and Dummett (1977). Jaśkowski (1936) showed that L_I is suitable for a many-valued
logic with denumerably many values.


1.61ᴬ Let ℬ and 𝒞 be in the relation R if and only if ⊢_L ℬ ⇔ 𝒞. Show that R is
an equivalence relation. Given equivalence classes [ℬ] and [𝒞], let [ℬ] ∪
[𝒞] = [ℬ ∨ 𝒞], [ℬ] ∩ [𝒞] = [ℬ ∧ 𝒞], and let the complement [ℬ]‾ be [¬ℬ]. Show
that the equivalence classes under R form a Boolean algebra with respect
to ∩, ∪, and ‾, called the Lindenbaum algebra L* determined by L. The
element 0 of L* is the equivalence class consisting of all contradictions
(i.e., negations of tautologies). The unit element 1 of L* is the equivalence
class consisting of all tautologies. Notice that ⊢_L ℬ ⇒ 𝒞 if and only if
[ℬ] ≤ [𝒞] in L*, and that ⊢_L ℬ ⇔ 𝒞 if and only if [ℬ] = [𝒞]. Show that a
Boolean function f (built up from variables, 0, and 1, using ∪, ∩, and ‾) is
equal to the constant function 1 in all Boolean algebras if and only if
⊢_L f*, where f* is obtained from f by changing ∪, ∩, ‾, 0, and 1 to
∨, ∧, ¬, A₁ ∧ ¬A₁, and A₁ ∨ ¬A₁, respectively.


2 


First-Order Logic and Model Theory 


2.1 Quantifiers 


There are various kinds of logical inference that cannot be justified on the 
basis of the propositional calculus; for example: 


1. Any friend of Martin is a friend of John. 
Peter is not John’s friend. 
Hence, Peter is not Martin’s friend. 

2. All human beings are rational. 
Some animals are human beings. 
Hence, some animals are rational. 

3. The successor of an even integer is odd. 
2 is an even integer. 
Hence, the successor of 2 is odd. 


The correctness of these inferences rests not only upon the meanings of the 
truth-functional connectives, but also upon the meaning of such expressions 
as “any,” “all,” and “some,” and other linguistic constructions. 

In order to make the structure of complex sentences more transparent, it 
is convenient to introduce special notation to represent frequently occur-
ring expressions. If P(x) asserts that x has the property P, then (∀x)P(x) means
that property P holds for all x or, in other words, that everything has the
property P. On the other hand, (∃x)P(x) means that some x has the property
P; that is, that there is at least one object having the property P. In (∀x)P(x),
“(∀x)” is called a universal quantifier; in (∃x)P(x), “(∃x)” is called an existential
quantifier. The study of quantifiers and related concepts is the principal sub- 
ject of this chapter. 
Examples
1′. Inference 1 above can be represented symbolically:


(∀x)(F(x, m) ⇒ F(x, j))
¬F(p, j)
------------------------
¬F(p, m)


Here, F(x, y) means that x is a friend of y, while m, j, and p denote
Martin, John, and Peter, respectively. The horizontal line above
“¬F(p, m)” stands for “hence” or “therefore.”


2’. Inference 2 becomes: 


(∀x)(H(x) ⇒ R(x))
(∃x)(A(x) ∧ H(x))
------------------
(∃x)(A(x) ∧ R(x))


Here, H, R, and A designate the properties of being human, rational, 
and an animal, respectively. 


3’. Inference 3 can be symbolized as follows: 


(∀x)(I(x) ∧ E(x) ⇒ D(s(x)))
I(b) ∧ E(b)
----------------------------
D(s(b))


Here, I, E, and D designate respectively the properties of being an 
integer, even, and odd; s(x) denotes the successor of x; and b denotes 
the integer 2. 


Notice that the validity of these inferences does not depend upon the par-
ticular meanings of F, m, j, p, H, R, A, I, E, D, s, and b.

Just as statement forms were used to indicate logical structure dependent 
upon the logical connectives, so also the form of inferences involving quan- 
tifiers, such as inferences 1-3, can be represented abstractly, as in 1’-3’. For 
this purpose, we shall use commas, parentheses, the symbols ¬ and ⇒ of the
propositional calculus, the universal quantifier symbol ∀, and the following
groups of symbols:


Individual variables: x₁, x₂, ..., xₙ, ...
Individual constants: a₁, a₂, ..., aₙ, ...
Predicate letters: Aₖⁿ (n and k are any positive integers)
Function letters: fₖⁿ (n and k are any positive integers)


The positive integer n that is a superscript of a predicate letter Aₖⁿ or of a
function letter fₖⁿ indicates the number of arguments, whereas the subscript
k is just an indexing number to distinguish different predicate or function
letters with the same number of arguments.*

In the preceding examples, x plays the role of an individual variable; m, 
j, p, and b play the role of individual constants; F is a binary predicate letter
(i.e., a predicate letter with two arguments); H, R, A, I, E, and D are monadic
predicate letters (i.e., predicate letters with one argument); and s is a function 
letter with one argument. 

The function letters applied to the variables and individual constants gen- 
erate the terms: 


1. Variables and individual constants are terms. 


2. If fₖⁿ is a function letter and t₁, t₂, ..., tₙ are terms, then fₖⁿ(t₁, t₂, ..., tₙ)
is a term.


3. An expression is a term only if it can be shown to be a term on the 
basis of conditions 1 and 2. 


Terms correspond to what in ordinary languages are nouns and noun 
phrases—for example, “two,” “two plus three,” and “two plus x.” 

The predicate letters applied to terms yield the atomic formulas; that is, if Aₖⁿ
is a predicate letter and t₁, t₂, ..., tₙ are terms, then Aₖⁿ(t₁, t₂, ..., tₙ) is an atomic
formula.

The well-formed formulas (wfs) of quantification theory are defined as follows: 


1. Every atomic formula is a wf. 
2. If ℬ and 𝒞 are wfs and y is a variable, then (¬ℬ), (ℬ ⇒ 𝒞), and ((∀y)ℬ)
are wfs.


3. An expression is a wf only if it can be shown to be a wf on the basis 
of conditions 1 and 2. 


* For example, in arithmetic both addition and multiplication take two arguments. So, we
would use one function letter, say f₁², for addition, and a different function letter, say f₂², for
multiplication.
In ((∀y)ℬ), “ℬ” is called the scope of the quantifier “(∀y).” Notice that ℬ need
not contain the variable y. In that case, we understand ((∀y)ℬ) to mean the
same thing as ℬ.

The expressions (ℬ ∧ 𝒞), (ℬ ∨ 𝒞), and (ℬ ⇔ 𝒞) are defined as in system L (see
page 29). It was unnecessary for us to use the symbol ∃ as a primitive symbol
because we can define existential quantification as follows:


((∃x)ℬ) stands for (¬((∀x)(¬ℬ)))


This definition is faithful to the meaning of the quantifiers: ℬ(x) is true for
some x if and only if it is not the case that ℬ(x) is false for all x.*


2.1.1 Parentheses 


The same conventions as made in Chapter 1 (page 11) about the omission of 
parentheses are made here, with the additional convention that quantifiers 
(∀y) and (∃y) rank in strength between ¬, ∧, ∨ and ⇒, ⇔. In other words,
when we restore parentheses, negations, conjunctions, and disjunctions are 
handled first, then we take care of universal and existential quantifications, 
and then we deal with conditionals and biconditionals. As before, for con- 
nectives of the same kind, we proceed from left to right. For consecutive 
negations and quantifications, we proceed from right to left. 


Examples 


Parentheses are restored in the following steps. 


1. (∀x₁)A₁¹(x₁) ⇒ A₁²(x₂, x₁)
   ((∀x₁)A₁¹(x₁)) ⇒ A₁²(x₂, x₁)
   (((∀x₁)A₁¹(x₁)) ⇒ A₁²(x₂, x₁))

2. (∀x₁)A₁¹(x₁) ∨ A₁²(x₂, x₁)
   (∀x₁)(A₁¹(x₁) ∨ A₁²(x₂, x₁))
   ((∀x₁)(A₁¹(x₁) ∨ A₁²(x₂, x₁)))

3. (∀x₁)¬(∃x₂)A₁²(x₁, x₂)
   (∀x₁)¬((∃x₂)A₁²(x₁, x₂))
   (∀x₁)(¬((∃x₂)A₁²(x₁, x₂)))
   ((∀x₁)(¬((∃x₂)A₁²(x₁, x₂))))


* We could have taken ∃ as primitive and then defined ((∀x)ℬ) as an abbreviation for (¬((∃x)
(¬ℬ))), since ℬ(x) is true for all x if and only if it is not the case that ℬ(x) is false for some x.


Exercises 


2.1 Restore parentheses to the following.
a. (∀x₁)A₁¹(x₁) ∧ ¬A₁¹(x₂)
b. (∀x₂)A₁¹(x₂) ⇔ A₁¹(x₂)
c. (∀x₂)(∃x₁)A₁²(x₁, x₂)
d. (∀x₁)(∀x₃)(∀x₄)A₁¹(x₁) ⇒ A₁¹(x₂) ∧ ¬A₁¹(x₁)
e. (∃x₁)(∀x₂)(∃x₃)A₁¹(x₁) ∨ (∃x₂)¬(∀x₃)A₁²(x₃, x₂)
f. (∀x₂)¬A₁¹(x₁) ⇒ A₁³(x₁, x₁, x₂) ∨ (∀x₁)A₁¹(x₁)
g. ¬(∀x₁)A₁¹(x₁) ⇒ (∃x₂)A₁¹(x₂) ⇒ A₁²(x₁, x₂) ∧ A₁¹(x₂)
2.2 Eliminate parentheses from the following wfs as far as is possible.
a. (((∀x₁)(A₁¹(x₁) ⇒ A₁¹(x₁))) ∨ ((∃x₁)A₁¹(x₁)))
b. ((¬((∃x₂)(A₁¹(x₂) ∨ A₁¹(a₁)))) ⇔ A₁¹(x₂))
c. (((∀x₁)(¬(¬A₁¹(a₃)))) ⇒ (A₁¹(x₁) ⇒ A₁¹(x₂)))
An occurrence of a variable x is said to be bound in a wf ℬ if either it is the
occurrence of x in a quantifier “(∀x)” in ℬ or it lies within the scope of a quan-
tifier “(∀x)” in ℬ. Otherwise, the occurrence is said to be free in ℬ.


Examples
1. A₁²(x₁, x₂)
2. A₁²(x₁, x₂) ⇒ (∀x₁)A₁¹(x₁)
3. (∀x₁)(A₁²(x₁, x₂) ⇒ (∀x₁)A₁¹(x₁))
4. (∃x₁)A₁²(x₁, x₂)


In Example 1, the single occurrence of x₁ is free. In Example 2, the occurrence
of x₁ in A₁²(x₁, x₂) is free, but the second and third occurrences are bound. In
Example 3, all occurrences of x₁ are bound, and in Example 4 both occur-
rences of x₁ are bound. (Remember that (∃x₁)A₁²(x₁, x₂) is an abbreviation of
¬(∀x₁)¬A₁²(x₁, x₂).) In all four wfs, every occurrence of x₂ is free. Notice that,
as in Example 2, a variable may have both free and bound occurrences in the
same wf. Also observe that an occurrence of a variable may be bound in some
wf ℬ but free in a subformula of ℬ. For example, the first occurrence of x₁ is free
in the wf of Example 2 but bound in the larger wf of Example 3.

A variable is said to be free (bound) in a wf ℬ if it has a free (bound) occur-
rence in ℬ. Thus, a variable may be both free and bound in the same wf; for
example, x₁ is free and bound in the wf of Example 2.
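
The definitions of free and bound variables are recursive on the structure of a wf, which a short sketch can make concrete. The tuple encoding of terms and wfs and the names term_vars, free_vars, and bound_vars are illustrative assumptions, not notation from the text. Only the primitive symbols ¬, ⇒, and ∀ are encoded.

```python
# Terms: ("var", name), ("const", name), ("fn", name, [terms]).
# Wfs:   ("atom", predicate, [terms]), ("not", wf), ("imp", wf, wf),
#        ("forall", variable_name, wf).


def term_vars(t):
    """The set of variables occurring in a term."""
    if t[0] == "var":
        return {t[1]}
    if t[0] == "const":
        return set()
    return set().union(*(term_vars(arg) for arg in t[2]))


def free_vars(f):
    """Variables having at least one free occurrence in f."""
    if f[0] == "atom":
        return set().union(*(term_vars(t) for t in f[2]))
    if f[0] == "not":
        return free_vars(f[1])
    if f[0] == "imp":
        return free_vars(f[1]) | free_vars(f[2])
    return free_vars(f[2]) - {f[1]}  # a quantifier binds its own variable


def bound_vars(f):
    """Variables having at least one bound occurrence in f."""
    if f[0] == "atom":
        return set()
    if f[0] == "not":
        return bound_vars(f[1])
    if f[0] == "imp":
        return bound_vars(f[1]) | bound_vars(f[2])
    return bound_vars(f[2]) | {f[1]}  # the occurrence in the quantifier counts


x1, x2 = ("var", "x1"), ("var", "x2")
example2 = ("imp", ("atom", "A12", [x1, x2]),
            ("forall", "x1", ("atom", "A11", [x1])))
example3 = ("forall", "x1", example2)

print(free_vars(example2), bound_vars(example2))
print(free_vars(example3), bound_vars(example3))
```

In the wf of Example 2 the variable x₁ comes out both free and bound, while in the wf of Example 3 only x₂ remains free.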


Exercises 


2.3 Pick out the free and bound occurrences of variables in the following wfs.
a. (∀x₃)(((∀x₁)A₁²(x₁, x₂)) ⇒ A₁²(x₃, a₁))
b. (∀x₂)A₁²(x₃, x₂) ⇒ (∀x₃)A₁²(x₃, x₂)
c. ((∀x₂)(∃x₁)A₁³(x₁, x₂, f₁²(x₁, x₂))) ∨ ¬(∀x₁)A₁²(x₂, f₁¹(x₁))


2.4 Indicate the free and bound occurrences of all variables in the wfs of 
Exercises 2.1 and 2.2. 


2.5 Indicate the free and bound variables in the wfs of Exercises 2.1-2.3. 


We shall often indicate that some of the variables x_{i₁}, ..., x_{iₖ} are free vari-
ables in a wf ℬ by writing ℬ as ℬ(x_{i₁}, ..., x_{iₖ}). This does not mean that ℬ
contains these variables as free variables, nor does it mean that ℬ does not
contain other free variables. This notation is convenient because we can then
agree to write as ℬ(t₁, ..., tₖ) the result of substituting in ℬ the terms t₁, ..., tₖ
for all free occurrences (if any) of x_{i₁}, ..., x_{iₖ}, respectively.

If ℬ is a wf and t is a term, then t is said to be free for xᵢ in ℬ if no free occurrence
of xᵢ in ℬ lies within the scope of any quantifier (∀xⱼ), where xⱼ is a variable in t.
This concept of t being free for xᵢ in a wf ℬ(xᵢ) will have certain technical applica-
tions later on. It means that, if t is substituted for all free occurrences (if any) of
xᵢ in ℬ(xᵢ), no occurrence of a variable in t becomes a bound occurrence in ℬ(t).


Examples 


1. The term x₂ is free for x₁ in A₁¹(x₁), but x₂ is not free for x₁ in (∀x₂)A₁¹(x₁).
2. The term f₁²(x₁, x₃) is free for x₁ in (∀x₂)A₁²(x₁, x₂) ⇒ A₁¹(x₁) but is not
free for x₁ in (∃x₃)(∀x₂)A₁²(x₁, x₂) ⇒ A₁¹(x₁).


The following facts are obvious. 


1. A term that contains no variables is free for any variable in any wf. 


2. A term t is free for any variable in ℬ if none of the variables of t is
bound in ℬ.


3. xᵢ is free for xᵢ in any wf.


4. Any term is free for xᵢ in ℬ if ℬ contains no free occurrences of xᵢ.
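
The condition "t is free for xᵢ in ℬ" can be checked by walking through ℬ while remembering which quantifiers enclose the current position. The sketch below is one possible way to do this; the tuple encoding and the names term_vars, exists, and free_for are assumptions introduced only for illustration.

```python
# Terms: ("var", name), ("const", name), ("fn", name, [terms]).
# Wfs:   ("atom", predicate, [terms]), ("not", wf), ("imp", wf, wf),
#        ("forall", variable_name, wf).


def term_vars(t):
    if t[0] == "var":
        return {t[1]}
    if t[0] == "const":
        return set()
    return set().union(*(term_vars(arg) for arg in t[2]))


def exists(x, f):
    """(Ex)f as an abbreviation of not (forall x) not f."""
    return ("not", ("forall", x, ("not", f)))


def free_for(t, x, f, binders=frozenset()):
    """True if no free occurrence of x in f lies within the scope of a
    quantifier (forall y) with y a variable of t. `binders` holds the
    quantified variables whose scopes enclose the current position."""
    if f[0] == "atom":
        x_occurs = any(x in term_vars(arg) for arg in f[2])
        return not (x_occurs and (term_vars(t) & binders))
    if f[0] == "not":
        return free_for(t, x, f[1], binders)
    if f[0] == "imp":
        return free_for(t, x, f[1], binders) and free_for(t, x, f[2], binders)
    y, body = f[1], f[2]
    if y == x:
        return True  # every occurrence of x below this quantifier is bound
    return free_for(t, x, body, binders | {y})


x1, x2, x3 = ("var", "x1"), ("var", "x2"), ("var", "x3")
t = ("fn", "f12", [x1, x3])                      # the term f(x1, x3)
a11 = ("atom", "A11", [x1])                      # A(x1)
inner = ("forall", "x2", ("atom", "A12", [x1, x2]))

print(free_for(x2, "x1", a11))                              # Example 1, first wf
print(free_for(x2, "x1", ("forall", "x2", a11)))            # Example 1, second wf
print(free_for(t, "x1", ("imp", inner, a11)))               # Example 2, first wf
print(free_for(t, "x1", ("imp", exists("x3", inner), a11))) # Example 2, second wf
```

The four calls correspond to the two examples above, and the same function can be used to test facts 1-4 on small cases.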
Exercises


2.6 Is the term f₁²(x₁, x₂) free for x₁ in the following wfs?
a. A₁²(x₁, x₂) ⇒ (∀x₂)A₁¹(x₂)
b. ((∀x₂)A₁²(x₂, a₁)) ∨ (∃x₂)A₁²(x₁, x₂)
c. (∀x₁)A₁²(x₁, x₂)
d. (∀x₂)A₁²(x₁, x₂)
e. (∀x₂)A₁¹(x₂) ⇒ A₁²(x₁, x₂)

2.7 Justify facts 1-4 above.


When English sentences are translated into formulas, certain general guide- 
lines will be useful: 


1. A sentence of the form “All As are Bs” becomes (∀x)(A(x) ⇒ B(x)). For
example, Every mathematician loves music is translated as (∀x)(M(x) ⇒
L(x)), where M(x) means x is a mathematician and L(x) means x loves
music.


2. A sentence of the form “Some As are Bs” becomes (∃x)(A(x) ∧ B(x)).
For example, Some New Yorkers are friendly becomes (∃x)(N(x) ∧ F(x)),
where N(x) means x is a New Yorker and F(x) means x is friendly.

3. A sentence of the form “No As are Bs” becomes (∀x)(A(x) ⇒ ¬B(x)).*
For example, No philosopher understands politics becomes (∀x)(P(x) ⇒
¬U(x)), where P(x) means x is a philosopher and U(x) means x under-
stands politics.


Let us consider a more complicated example: Some people respect everyone. 
This can be translated as (∃x)(P(x) ∧ (∀y)(P(y) ⇒ R(x, y))), where P(x) means x
is a person and R(x, y) means x respects y. 

Notice that, in informal discussions, to make formulas easier to read we 
may use lower-case letters u, v, x, y, z instead of our official notation xᵢ for
individual variables, capital letters A, B, C,... instead of our official notation
Aₖⁿ for predicate letters, lower-case letters f, g, h,... instead of our official nota-
tion fₖⁿ for function letters, and lower-case letters a, b, c,... instead of our
official notation aᵢ for individual constants.


* As we shall see later, this is equivalent to ¬(∃x)(A(x) ∧ B(x)).


Exercises


2.8 Translate the following sentences into wfs.
a. Anyone who is persistent can learn logic.
b. No politician is honest.
c. Not all birds can fly.
d. All birds cannot fly.
e. x is transcendental only if it is irrational.
f. Seniors date only juniors.
g. If anyone can solve the problem, Hilary can.
h. Nobody loves a loser.
i. Nobody in the statistics class is smarter than everyone in the logic
class.
j. John hates all people who do not hate themselves.
k. Everyone loves somebody and no one loves everybody, or some-
body loves everybody and someone loves nobody.
l. You can fool some of the people all of the time, and you can fool all
the people some of the time, but you can’t fool all the people all the
time.
m. Any sets that have the same members are equal.
n. Anyone who knows Julia loves her.
o. There is no set belonging to precisely those sets that do not belong
to themselves.
p. There is no barber who shaves precisely those men who do not
shave themselves.


2.9 Translate the following into everyday English. Note that everyday
English does not use variables.
a. (∀x)(M(x) ∧ (∀y)¬W(x, y) ⇒ U(x)), where M(x) means x is a man, W(x, y)
means x is married to y, and U(x) means x is unhappy.
b. (∀x)(V(x) ∧ P(x) ⇒ A(x, b)), where V(x) means x is an even integer, P(x)
means x is a prime integer, A(x, y) means x = y, and b denotes 2.
c. ¬(∃y)(I(y) ∧ (∀x)(I(x) ⇒ L(x, y))), where I(y) means y is an integer and
L(x, y) means x ≤ y.
d. In the following wfs, A₁¹(x) means x is a person and A₁²(x, y) means
x hates y.
   i. (∃x)(A₁¹(x) ∧ (∀y)(A₁¹(y) ⇒ A₁²(x, y)))
   ii. (∀x)(A₁¹(x) ⇒ (∀y)(A₁¹(y) ⇒ A₁²(x, y)))
   iii. (∃x)(A₁¹(x) ∧ (∀y)(A₁¹(y) ⇒ (A₁²(x, y) ⇔ A₁²(y, y))))
e. (∀x)(H(x) ⇒ (∃y)(∃z)(¬A(y, z) ∧ (∀u)(P(u, x) ⇔ (A(u, y) ∨ A(u, z))))), where
H(x) means x is a person, A(u, v) means “u = v,” and P(u, x) means u is
a parent of x.




2.2 First-Order Languages and Their Interpretations: 
Satisfiability and Truth: Models 


Well-formed formulas have meaning only when an interpretation is given for 
the symbols. We usually are interested in interpreting wfs whose symbols 
come from a specific language. For that reason, we shall define the notion of 
a first-order language.* 


Definition 
A first-order language ℒ contains the following symbols.


a. The propositional connectives ¬ and ⇒, and the universal quantifier
symbol ∀.
b. Punctuation marks: the left parenthesis “(”, the right parenthesis “)”,
and the comma “,”.†
c. Denumerably many individual variables x₁, x₂, ....
d. A finite or denumerable, possibly empty, set of function letters.
e. A finite or denumerable, possibly empty, set of individual constants.
f. A nonempty set of predicate letters.


By a term of ℒ we mean a term whose symbols are symbols of ℒ.
By a wf of ℒ we mean a wf whose symbols are symbols of ℒ.


Thus, in a language ℒ, some or all of the function letters and individual con-
stants may be absent, and some (but not all) of the predicate letters may be
absent.‡ The individual constants, function letters, and predicate letters of a
language ℒ are called the nonlogical constants of ℒ. Languages are designed
in accordance with the subject matter we wish to study. A language for arith-
metic might contain function letters for addition and multiplication and a
predicate letter for equality, whereas a language for geometry is likely to
have predicate letters for equality and the notions of point and line, but no
function letters at all.


* The adjective “first-order” is used to distinguish the languages we shall study here from
those in which there are predicates having other predicates or functions as arguments or in
which predicate quantifiers or function quantifiers are permitted, or both. Most mathemati-
cal theories can be formalized within first-order languages, although there may be a loss
of some of the intuitive content of those theories. Second-order languages are discussed in
the appendix on second-order logic. Examples of higher-order languages are studied also in
Gödel (1931), Tarski (1933), Church (1940), Leivant (1994), and van Benthem and Doets (1983).
Differences between first-order and higher-order theories are examined in Corcoran (1980)
and Shapiro (1991).

† The punctuation marks are not strictly necessary; they can be avoided by redefining the
notions of term and wf. However, their use makes it easier to read and comprehend formulas.

‡ If there were no predicate letters, there would be no wfs.


Definition 


Let ℒ be a first-order language. An interpretation M of ℒ consists of the fol-
lowing ingredients.


a. A nonempty set D, called the domain of the interpretation.

b. For each predicate letter Aⱼⁿ of ℒ, an assignment of an n-place relation
(Aⱼⁿ)ᴹ in D.

c. For each function letter fⱼⁿ of ℒ, an assignment of an n-place opera-
tion (fⱼⁿ)ᴹ in D (that is, a function from Dⁿ into D).

d. For each individual constant aᵢ of ℒ, an assignment of some fixed ele-
ment (aᵢ)ᴹ of D.


Given such an interpretation, variables are thought of as ranging over the 
set D, and ¬, ⇒, and quantifiers are given their usual meaning. Remember
that an n-place relation in D can be thought of as a subset of Dⁿ, the set of all
n-tuples of elements of D. For example, if D is the set of human beings, then 
the relation “father of” can be identified with the set of all ordered pairs (x, y) 
such that x is the father of y. 

For a given interpretation of a language ℒ, a wf of ℒ without free variables
(called a closed wf or a sentence) represents a proposition that is true or false, 
whereas a wf with free variables may be satisfied (i.e., true) for some values 
in the domain and not satisfied (i.e., false) for the others. 


Examples 
Consider the following wfs: 
1. A₁²(x₁, x₂)
2. (∀x₂)A₁²(x₁, x₂)
3. (∃x₁)(∀x₂)A₁²(x₁, x₂)


Let us take as domain the set of all positive integers and interpret A₁²(y, z) as
y ≤ z. Then wf 1 represents the expression “x₁ ≤ x₂”, which is satisfied by all
the ordered pairs (a, b) of positive integers such that a ≤ b. Wf 2 represents the
expression “For all positive integers x₂, x₁ ≤ x₂”,* which is satisfied only by the
integer 1. Wf 3 is a true sentence asserting that there is a smallest positive integer.
If we were to take as domain the set of all integers, then wf 3 would be false.


* In ordinary English, one would say “x₁ is less than or equal to all positive integers.”
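
The three wfs can be evaluated mechanically once the domain is finite. The sketch below replaces the set of all positive integers by the initial segment 1, ..., 50, which is an assumption made purely so that the quantifiers can be checked by enumeration; the names leq, wf1, wf2, and wf3 are likewise illustrative.

```python
def leq(y, z):
    """Interpretation of the binary predicate letter: y <= z."""
    return y <= z


def wf1(x1, x2):
    # A(x1, x2)
    return leq(x1, x2)


def wf2(x1, domain):
    # (forall x2) A(x1, x2), with x1 free
    return all(leq(x1, x2) for x2 in domain)


def wf3(domain):
    # (exists x1)(forall x2) A(x1, x2), a closed wf
    return any(wf2(x1, domain) for x1 in domain)


# Finite stand-in for the positive integers (an assumption of this sketch).
domain = range(1, 51)

satisfying_wf2 = [x1 for x1 in domain if wf2(x1, domain)]
print(wf1(2, 5), wf1(5, 2))
print(satisfying_wf2)
print(wf3(domain))
```

A finite truncation cannot reproduce the remark about the domain of all integers, since every finite set of integers has a least element; that part of the example depends essentially on the domain being infinite.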
Exercises


2.10 For the following wfs and for the given interpretations, indicate for what values the wfs are satisfied (if they contain free variables) or whether they are true or false (if they are closed wfs).

i. $A_1^2(f_1^2(x_1, x_2), a_1)$

ii. $A_1^2(x_1, x_2) \Rightarrow A_1^2(x_2, x_1)$

iii. $(\forall x_1)(\forall x_2)(\forall x_3)(A_1^2(x_1, x_2) \land A_1^2(x_2, x_3) \Rightarrow A_1^2(x_1, x_3))$

a. The domain is the set of positive integers, $A_1^2(y, z)$ is $y \geq z$, $f_1^2(y, z)$ is $y \cdot z$, and $a_1$ is 2.

b. The domain is the set of integers, $A_1^2(y, z)$ is $y = z$, $f_1^2(y, z)$ is $y + z$, and $a_1$ is 0.

c. The domain is the set of all sets of integers, $A_1^2(y, z)$ is $y \subseteq z$, $f_1^2(y, z)$ is $y \cap z$, and $a_1$ is the empty set $\emptyset$.

2.11 Describe in everyday English the assertions determined by the following wfs and interpretations.

a. $(\forall x)(\forall y)(A_1^2(x, y) \Rightarrow (\exists z)(A_1^1(z) \land A_1^2(x, z) \land A_1^2(z, y)))$, where the domain D is the set of real numbers, $A_1^2(x, y)$ means $x < y$, and $A_1^1(z)$ means z is a rational number.

b. $(\forall x)(A_1^1(x) \Rightarrow (\exists y)(A_2^1(y) \land A_1^2(y, x)))$, where D is the set of all days and people, $A_1^1(x)$ means x is a day, $A_2^1(y)$ means y is a sucker, and $A_1^2(y, x)$ means y is born on day x.

c. $(\forall x)(\forall y)(A_1^1(x) \land A_1^1(y) \Rightarrow A_2^1(f_1^2(x, y)))$, where D is the set of integers, $A_1^1(x)$ means x is odd, $A_2^1(x)$ means x is even, and $f_1^2(x, y)$ denotes $x + y$.

d. For the following wfs, D is the set of all people and $A_1^2(u, v)$ means u loves v.

i. $(\exists x)(\forall y)A_1^2(x, y)$

ii. $(\forall y)(\exists x)A_1^2(x, y)$

iii. $(\exists x)(\forall y)((\forall z)(A_1^2(y, z)) \Rightarrow A_1^2(x, y))$

iv. $(\exists x)(\forall y)\neg A_1^2(x, y)$

e. $(\forall x)(\forall u)(\forall v)(\forall w)(E(f(u, u), x) \land E(f(v, v), x) \land E(f(w, w), x) \Rightarrow E(u, v) \lor E(u, w) \lor E(v, w))$, where D is the set of real numbers, $E(x, y)$ means $x = y$, and f denotes the multiplication operation.

f. $A_1^1(x_1) \land (\exists x_3)(A_2^2(x_1, x_3) \land A_2^2(x_3, x_2))$, where D is the set of people, $A_1^1(u)$ means u is a woman, and $A_2^2(u, v)$ means u is a parent of v.

g. $(\forall x_1)(\forall x_2)(A_1^1(x_1) \land A_1^1(x_2) \Rightarrow A_2^1(f_1^2(x_1, x_2)))$, where D is the set of real numbers, $A_1^1(u)$ means u is negative, $A_2^1(u)$ means u is positive, and $f_1^2(u, v)$ is the product of u and v.


The concepts of satisfiability and truth are intuitively clear, but, following 
Tarski (1936), we also can provide a rigorous definition. Such a definition is 
necessary for carrying out precise proofs of many metamathematical results. 
Satisfiability will be the fundamental notion, on the basis of which the notion of truth will be defined. Moreover, instead of talking about the n-tuples of objects that satisfy a wf that has n free variables, it is much more convenient from a technical standpoint to deal uniformly with denumerable sequences. What we have in mind is that a denumerable sequence $s = (s_1, s_2, s_3, \ldots)$ is to be thought of as satisfying a wf $\mathcal{B}$ that has $x_{j_1}, x_{j_2}, \ldots, x_{j_n}$ as free variables (where $j_1 < j_2 < \cdots < j_n$) if the n-tuple $(s_{j_1}, s_{j_2}, \ldots, s_{j_n})$ satisfies $\mathcal{B}$ in the usual sense. For example, a denumerable sequence $(s_1, s_2, s_3, \ldots)$ of objects in the domain of an interpretation M will turn out to satisfy the wf $A_1^2(x_2, x_5)$ if and only if the ordered pair $(s_2, s_5)$ is in the relation $(A_1^2)^M$ assigned to the predicate letter $A_1^2$ by the interpretation M.

Let M be an interpretation of a language $\mathcal{L}$ and let D be the domain of M. Let $\Sigma$ be the set of all denumerable sequences of elements of D. For a wf $\mathcal{B}$ of $\mathcal{L}$, we shall define what it means for a sequence $s = (s_1, s_2, \ldots)$ in $\Sigma$ to satisfy $\mathcal{B}$ in M. As a preliminary step, for a given s in $\Sigma$ we shall define a function $s^*$ that assigns to each term t of $\mathcal{L}$ an element $s^*(t)$ in D.


1. If t is a variable $x_j$, let $s^*(t)$ be $s_j$.

2. If t is an individual constant $a_j$, then $s^*(t)$ is the interpretation $(a_j)^M$ of this constant.

3. If $f_k^n$ is a function letter, $(f_k^n)^M$ is the corresponding operation in D, and $t_1, \ldots, t_n$ are terms, then

$s^*(f_k^n(t_1, \ldots, t_n)) = (f_k^n)^M(s^*(t_1), \ldots, s^*(t_n))$


Intuitively, $s^*(t)$ is the element of D obtained by substituting, for each j, a name of $s_j$ for all occurrences of $x_j$ in t and then performing the operations of the interpretation corresponding to the function letters of t. For instance, if t is $f_2^2(x_3, f_1^2(x_1, a_1))$ and if the interpretation has the set of integers as its domain, $f_2^2$ and $f_1^2$ are interpreted as ordinary multiplication and addition, respectively, and $a_1$ is interpreted as 2, then, for any sequence $s = (s_1, s_2, \ldots)$ of integers, $s^*(t)$ is the integer $s_3 \cdot (s_1 + 2)$. This is really nothing more than the ordinary way of reading mathematical expressions.
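
The three clauses defining the value of a term translate directly into a recursive function. The following is a minimal sketch under stated assumptions: terms are encoded as nested tuples, a sequence is represented by a dictionary holding only the components that are needed, and the names `s_star`, `f_mul`, and `f_add` are hypothetical labels chosen for this illustration.

```python
import operator

# Terms as nested tuples (an illustrative encoding, not from the text):
#   ("var", j)             the variable x_j
#   ("const", j)           the individual constant a_j
#   ("fn", name, t1, ...)  a function letter applied to terms
def s_star(term, s, functions, constants):
    """Value of a term under the sequence s (a dict j -> s_j)."""
    kind = term[0]
    if kind == "var":        # clause 1: a variable x_j takes the value s_j
        return s[term[1]]
    if kind == "const":      # clause 2: a constant takes its interpretation
        return constants[term[1]]
    if kind == "fn":         # clause 3: apply the operation to the values of the arguments
        args = [s_star(u, s, functions, constants) for u in term[2:]]
        return functions[term[1]](*args)
    raise ValueError(f"not a term: {term!r}")

# The example from the text: outer letter read as multiplication,
# inner letter as addition, and a_1 interpreted as 2.
functions = {"f_mul": operator.mul, "f_add": operator.add}
constants = {1: 2}
t = ("fn", "f_mul", ("var", 3), ("fn", "f_add", ("var", 1), ("const", 1)))

s = {1: 5, 2: 7, 3: 4}
print(s_star(t, s, functions, constants))  # 4 * (5 + 2) = 28
```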

Now we proceed to the definition of satisfaction, which will be an induc- 
tive definition. 


1. If $\mathcal{B}$ is an atomic wf $A_k^n(t_1, \ldots, t_n)$ and $(A_k^n)^M$ is the corresponding n-place relation of the interpretation, then a sequence $s = (s_1, s_2, \ldots)$ satisfies $\mathcal{B}$ if and only if $(A_k^n)^M(s^*(t_1), \ldots, s^*(t_n))$, that is, if the n-tuple $(s^*(t_1), \ldots, s^*(t_n))$ is in the relation $(A_k^n)^M$.*


* For example, if the domain of the interpretation is the set of real numbers, the interpretation of $A_1^2$ is the relation <, and the interpretation of $f_1^1$ is the function $e^x$, then a sequence $s = (s_1, s_2, \ldots)$ of real numbers satisfies $A_1^2(f_1^1(x_2), x_5)$ if and only if $e^{s_2} < s_5$. If the domain is the set of integers, the interpretation of $A_1^4(x, y, u, v)$ is $x \cdot v = u \cdot y$, and the interpretation of $a_1$ is 3, then a sequence $s = (s_1, s_2, \ldots)$ of integers satisfies $A_1^4(x_3, a_1, x_1, x_3)$ if and only if $(s_3)^2 = 3s_1$.
2. s satisfies $\neg\mathcal{B}$ if and only if s does not satisfy $\mathcal{B}$.

3. s satisfies $\mathcal{B} \Rightarrow \mathcal{C}$ if and only if s does not satisfy $\mathcal{B}$ or s satisfies $\mathcal{C}$.

4. s satisfies $(\forall x_i)\mathcal{B}$ if and only if every sequence that differs from s in at most the ith component satisfies $\mathcal{B}$.*


Intuitively, a sequence $s = (s_1, s_2, \ldots)$ satisfies a wf $\mathcal{B}$ if and only if, when, for each i, we replace all free occurrences of $x_i$ (if any) in $\mathcal{B}$ by a symbol representing $s_i$, the resulting proposition is true under the given interpretation.
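
The four conditions of the inductive definition can be sketched as a recursive evaluator over a finite domain. This is an illustration only, under several assumptions that are not part of the text: wfs are encoded as nested tuples, function letters are omitted, the domain is finite so that condition 4 can be checked exhaustively, and a sequence is a dictionary holding just the components that matter.

```python
# Wfs and terms as nested tuples (an encoding chosen for this sketch):
#   terms: ("var", j) for x_j, ("const", j) for a_j
#   wfs:   ("atom", P, t1, ..., tn), ("not", B), ("imp", B, C), ("all", i, B)

def term_value(t, s, interp):
    """Value of a variable or constant under s (function letters omitted here)."""
    return s[t[1]] if t[0] == "var" else interp["constants"][t[1]]

def satisfies(s, wf, interp):
    """True iff the sequence s (a dict j -> s_j) satisfies wf in interp."""
    tag = wf[0]
    if tag == "atom":      # condition 1: the tuple of values lies in the relation
        values = tuple(term_value(t, s, interp) for t in wf[2:])
        return values in interp["relations"][wf[1]]
    if tag == "not":       # condition 2
        return not satisfies(s, wf[1], interp)
    if tag == "imp":       # condition 3
        return (not satisfies(s, wf[1], interp)) or satisfies(s, wf[2], interp)
    if tag == "all":       # condition 4: vary only the i-th component
        i, body = wf[1], wf[2]
        return all(satisfies({**s, i: c}, body, interp) for c in interp["domain"])
    raise ValueError(f"not a wf: {wf!r}")

domain = [1, 2, 3]
interp = {
    "domain": domain,
    "constants": {},
    "relations": {"A": {(y, z) for y in domain for z in domain if y <= z}},
}
atom = ("atom", "A", ("var", 1), ("var", 2))
least = ("all", 2, atom)                             # (forall x2) A(x1, x2)
exists_least = ("not", ("all", 1, ("not", least)))   # (exists x1)(forall x2) A(x1, x2)

print(satisfies({1: 1, 2: 3}, least, interp))         # True: s_1 = 1 is least
print(satisfies({1: 2, 2: 3}, least, interp))         # False: s_1 = 2 is not
print(satisfies({1: 2, 2: 3}, exists_least, interp))  # True: a closed wf
```

The existential quantifier is written out as "not for all not", so the evaluator needs no separate clause for it.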

Now we can define the notions of truth and falsity of wfs for a given 
interpretation. 


Definitions 


1. A wf $\mathcal{B}$ is true for the interpretation M (written $\models_M \mathcal{B}$) if and only if every sequence in $\Sigma$ satisfies $\mathcal{B}$.

2. $\mathcal{B}$ is said to be false for M if and only if no sequence in $\Sigma$ satisfies $\mathcal{B}$.

3. An interpretation M is said to be a model for a set $\Gamma$ of wfs if and only if every wf in $\Gamma$ is true for M.
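
For a finite domain, truth and falsity can be decided by running through all relevant sequences. The sketch below assumes that a wf is given as a Python predicate on tuples $(s_1, \ldots, s_k)$, where k bounds the subscripts of its free variables; the helper `classify` and the sample predicates are hypothetical names used only here.

```python
from itertools import product

def classify(wf, k, domain):
    """Return 'true', 'false', or 'neither' for wf over a finite domain.

    wf is a predicate on tuples (s_1, ..., s_k); only these first k
    components of a sequence are assumed to matter.
    """
    outcomes = [wf(s) for s in product(domain, repeat=k)]
    if all(outcomes):
        return "true"      # every sequence satisfies wf
    if not any(outcomes):
        return "false"     # no sequence satisfies wf
    return "neither"

D = range(1, 6)  # a small finite domain, chosen for illustration

leq = lambda s: s[0] <= s[1]                                  # A(x1, x2), read as x1 <= x2
has_least = lambda s: any(all(a <= b for b in D) for a in D)  # a closed wf
below_self = lambda s: s[0] < s[0]                            # x1 < x1

print(classify(leq, 2, D))         # neither
print(classify(has_least, 0, D))   # true
print(classify(below_self, 1, D))  # false
```

The first case shows that a wf with free variables can fail to be either true or false for an interpretation.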


The plausibility of our definition of truth will be strengthened by the fact 
that we can derive all of the following expected properties (I)-(XI) of the notions
of truth, falsity, and satisfaction. Proofs that are not explicitly given are left 
to the reader (or may be found in the answer to Exercise 2.12). Most of the 
results are also obvious if one wishes to use only the ordinary intuitive 
understanding of the notions of truth, falsity, and satisfaction. 


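I. a. $\mathcal{B}$ is false for an interpretation M if and only if $\neg\mathcal{B}$ is true for M.
b. $\mathcal{B}$ is true for M if and only if $\neg\mathcal{B}$ is false for M.
II. It is not the case that both $\models_M \mathcal{B}$ and $\models_M \neg\mathcal{B}$; that is, no wf can be both true and false for M.
III. If $\models_M \mathcal{B}$ and $\models_M \mathcal{B} \Rightarrow \mathcal{C}$, then $\models_M \mathcal{C}$.
IV. $\mathcal{B} \Rightarrow \mathcal{C}$ is false for M if and only if $\models_M \mathcal{B}$ and $\models_M \neg\mathcal{C}$.
V.† Consider an interpretation M with domain D.
a. A sequence s satisfies $\mathcal{B} \land \mathcal{C}$ if and only if s satisfies $\mathcal{B}$ and s satisfies $\mathcal{C}$.
b. s satisfies $\mathcal{B} \lor \mathcal{C}$ if and only if s satisfies $\mathcal{B}$ or s satisfies $\mathcal{C}$.
c. s satisfies $\mathcal{B} \Leftrightarrow \mathcal{C}$ if and only if s satisfies both $\mathcal{B}$ and $\mathcal{C}$ or s satisfies neither $\mathcal{B}$ nor $\mathcal{C}$.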


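* In other words, a sequence $s = (s_1, s_2, \ldots, s_i, \ldots)$ satisfies $(\forall x_i)\mathcal{B}$ if and only if, for every element c of the domain, the sequence $(s_1, s_2, \ldots, c, \ldots)$ satisfies $\mathcal{B}$. Here, $(s_1, s_2, \ldots, c, \ldots)$ denotes the sequence obtained from $(s_1, s_2, \ldots, s_i, \ldots)$ by replacing the ith component $s_i$ by c. Note also that, if s satisfies $(\forall x_i)\mathcal{B}$, then, as a special case, s satisfies $\mathcal{B}$.

† Remember that $\mathcal{B} \land \mathcal{C}$, $\mathcal{B} \lor \mathcal{C}$, $\mathcal{B} \Leftrightarrow \mathcal{C}$, and $(\exists x_i)\mathcal{B}$ are abbreviations for $\neg(\mathcal{B} \Rightarrow \neg\mathcal{C})$, $\neg\mathcal{B} \Rightarrow \mathcal{C}$, $(\mathcal{B} \Rightarrow \mathcal{C}) \land (\mathcal{C} \Rightarrow \mathcal{B})$, and $\neg(\forall x_i)\neg\mathcal{B}$, respectively.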
d. s satisfies $(\exists x_i)\mathcal{B}$ if and only if there is a sequence s′ that differs from s in at most the ith component such that s′ satisfies $\mathcal{B}$. (In other words, $s = (s_1, s_2, \ldots, s_i, \ldots)$ satisfies $(\exists x_i)\mathcal{B}$ if and only if there is an element c in the domain D such that the sequence $(s_1, s_2, \ldots, c, \ldots)$ satisfies $\mathcal{B}$.)
VI. $\models_M \mathcal{B}$ if and only if $\models_M (\forall x_i)\mathcal{B}$.
We can extend this result in the following way. By the closure* of $\mathcal{B}$ we mean the closed wf obtained from $\mathcal{B}$ by prefixing in universal quantifiers those variables, in order of descending subscripts, that are free in $\mathcal{B}$. If $\mathcal{B}$ has no free variables, the closure of $\mathcal{B}$ is defined to be $\mathcal{B}$ itself. For example, if $\mathcal{B}$ is $A_1^2(x_2, x_5) \Rightarrow (\exists x_2)A_1^3(x_1, x_2, x_3)$, its closure is $(\forall x_5)(\forall x_3)(\forall x_2)(\forall x_1)\mathcal{B}$. It follows from (VI) that a wf $\mathcal{B}$ is true if and only if its closure is true.
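
The construction of the closure can be sketched in a few lines. The tuple encoding of wfs and the helper names `free_vars`, `closure`, and `prefix` are assumptions made for this illustration; the example wf follows the shape of the one in the text, with a binary atom on $x_2, x_5$ implying an existentially quantified ternary atom.

```python
# Wfs as nested tuples (illustrative encoding):
#   ("atom", P, t1, ..., tn) with terms ("var", j) or ("const", j),
#   ("not", B), ("imp", B, C), ("all", i, B), ("ex", i, B)

def free_vars(wf):
    """Set of subscripts of the variables that occur free in wf."""
    tag = wf[0]
    if tag == "atom":
        return {t[1] for t in wf[2:] if t[0] == "var"}
    if tag == "not":
        return free_vars(wf[1])
    if tag == "imp":
        return free_vars(wf[1]) | free_vars(wf[2])
    if tag in ("all", "ex"):
        return free_vars(wf[2]) - {wf[1]}
    raise ValueError(f"not a wf: {wf!r}")

def closure(wf):
    """Prefix universal quantifiers so that subscripts descend from left to right."""
    result = wf
    for i in sorted(free_vars(wf)):   # wrap in ascending order: the largest ends up outermost
        result = ("all", i, result)
    return result

def prefix(wrapped, body):
    """Subscripts of the quantifiers wrapped around body, outermost first."""
    out = []
    while wrapped is not body:
        out.append(wrapped[1])
        wrapped = wrapped[2]
    return out

B = ("imp",
     ("atom", "A2", ("var", 2), ("var", 5)),
     ("ex", 2, ("atom", "A3", ("var", 1), ("var", 2), ("var", 3))))

print(prefix(closure(B), B))  # [5, 3, 2, 1]
```

Note that $x_2$ is quantified in the closure because it occurs free in the antecedent, even though its occurrences in the consequent are bound.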

VII. Every instance of a tautology is true for any interpretation. (An instance of a statement form is a wf obtained from the statement form by substituting wfs for all statement letters, with all occurrences of the same statement letter being replaced by the same wf. Thus, an instance of $A_1 \Rightarrow \neg A_2 \lor A_1$ is $A_1^1(x_2) \Rightarrow (\neg(\forall x_1)A_1^1(x_1)) \lor A_1^1(x_2)$.)

To prove (VII), show that all instances of the axioms of the system L are true and then use (III) and Proposition 1.14.

VIII. If the free variables (if any) of a wf $\mathcal{B}$ occur in the list $x_{i_1}, \ldots, x_{i_k}$ and if the sequences s and s′ have the same components in the $i_1$th, ..., $i_k$th places, then s satisfies $\mathcal{B}$ if and only if s′ satisfies $\mathcal{B}$. [Hint: Use induction on the number of connectives and quantifiers in $\mathcal{B}$. First prove this lemma: If the variables in a term t occur in the list $x_{i_1}, \ldots, x_{i_k}$, and if s and s′ have the same components in the $i_1$th, ..., $i_k$th places, then $s^*(t) = (s')^*(t)$. In particular, if t contains no variables at all, $s^*(t) = (s')^*(t)$ for any sequences s and s′.]


Although, by (VIII), a particular wf with k free variables is essentially satis- 
fied or not only by k-tuples, rather than by denumerable sequences, it is more 
convenient for a general treatment of satisfaction to deal with infinite rather 
than finite sequences. If we were to define satisfaction using finite sequences, 
conditions 3 and 4 of the definition of satisfaction would become much more 
complicated. 

Let $x_{i_1}, \ldots, x_{i_k}$ be k distinct variables in order of increasing subscripts. Let $\mathcal{B}(x_{i_1}, \ldots, x_{i_k})$ be a wf that has $x_{i_1}, \ldots, x_{i_k}$ as its only free variables. The set of k-tuples $(b_1, \ldots, b_k)$ of elements of the domain D such that any sequence with $b_1, \ldots, b_k$ in its $i_1$th, ..., $i_k$th places, respectively, satisfies $\mathcal{B}(x_{i_1}, \ldots, x_{i_k})$ is called the relation (or property†) of the interpretation defined by $\mathcal{B}$. Extending our terminology, we shall say that every k-tuple $(b_1, \ldots, b_k)$ in this relation satisfies $\mathcal{B}(x_{i_1}, \ldots, x_{i_k})$ in the interpretation M; this will be written $\models_M \mathcal{B}[b_1, \ldots, b_k]$. This extended notion of satisfaction corresponds to the original intuitive notion.


* A better term for closure would be universal closure.
† When k = 1, the relation is called a property.
Examples


1. If the domain D of M is the set of human beings, $A_1^2(x, y)$ is interpreted as x is a brother of y, and $A_2^2(x, y)$ is interpreted as x is a parent of y, then the binary relation on D corresponding to the wf $\mathcal{B}(x_1, x_2)$: $(\exists x_3)(A_1^2(x_1, x_3) \land A_2^2(x_3, x_2))$ is the relation of unclehood. $\models_M \mathcal{B}[b, c]$ when and only when b is an uncle of c.

2. If the domain is the set of positive integers, $A_1^2$ is interpreted as =, $f_1^2$ is interpreted as multiplication, and $a_1$ is interpreted as 1, then the wf $\mathcal{B}(x_1)$:

$\neg A_1^2(x_1, a_1) \land (\forall x_2)((\exists x_3)A_1^2(x_1, f_1^2(x_2, x_3)) \Rightarrow A_1^2(x_2, x_1) \lor A_1^2(x_2, a_1))$

determines the property of being a prime number. Thus $\models_M \mathcal{B}[k]$ if and only if k is a prime number.
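
The property defined by the wf in Example 2 can be computed directly by reading its quantifiers as loops. This is a sketch under the assumption that the domain is cut down to {1, ..., N}; the cut is harmless here because any divisor of k is at most k. The names `N`, `D`, and `defines_prime` are introduced for illustration.

```python
N = 30
D = range(1, N + 1)  # finite fragment of the positive integers (an assumption)

def defines_prime(k):
    """Evaluate the wf of Example 2 with x1 = k.

    First conjunct:  k is not 1.
    Second conjunct: every x2 for which some x3 gives k = x2 * x3
                     is equal to k or to 1.
    """
    not_one = k != 1
    only_trivial_divisors = all(
        (not any(k == x2 * x3 for x3 in D)) or x2 == k or x2 == 1
        for x2 in D
    )
    return not_one and only_trivial_divisors

print([k for k in D if defines_prime(k)])
# [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
```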


IX. If $\mathcal{B}$ is a closed wf of a language $\mathcal{L}$, then, for any interpretation M, either $\models_M \mathcal{B}$ or $\models_M \neg\mathcal{B}$; that is, either $\mathcal{B}$ is true for M or $\mathcal{B}$ is false for M. [Hint: Use (VIII).] Of course, $\mathcal{B}$ may be true for some interpretations and false for others. (As an example, consider $A_1^1(a_1)$. If M is an interpretation whose domain is the set of positive integers, $A_1^1$ is interpreted as the property of being a prime, and the interpretation of $a_1$ is 2, then $A_1^1(a_1)$ is true. If we change the interpretation by interpreting $a_1$ as 4, then $A_1^1(a_1)$ becomes false.)

If $\mathcal{B}$ is not closed, that is, if $\mathcal{B}$ contains free variables, $\mathcal{B}$ may be neither true nor false for some interpretation. For example, if $\mathcal{B}$ is $A_1^2(x_1, x_2)$ and we consider an interpretation in which the domain is the set of integers and $A_1^2(y, z)$ is interpreted as $y < z$, then $\mathcal{B}$ is satisfied by only those sequences $s = (s_1, s_2, \ldots)$ of integers in which $s_1 < s_2$. Hence, $\mathcal{B}$ is neither true nor false for this interpretation. On the other hand, there are wfs that are not closed but that nevertheless are true or false for every interpretation. A simple example is the wf $A_1^1(x_1) \lor \neg A_1^1(x_1)$, which is true for every interpretation.


X. Assume t is free for $x_i$ in $\mathcal{B}(x_i)$. Then $(\forall x_i)\mathcal{B}(x_i) \Rightarrow \mathcal{B}(t)$ is true for all interpretations.


The proof of (X) is based upon the following lemmas. 


Lemma 1 


If t and u are terms, s is a sequence in $\Sigma$, t′ results from t by replacing all occurrences of $x_i$ by u, and s′ results from s by replacing the ith component of s by $s^*(u)$, then $s^*(t') = (s')^*(t)$. [Hint: Use induction on the length of t.*]


* The length of an expression is the number of occurrences of symbols in the expression. 
Lemma 2


Let t be free for $x_i$ in $\mathcal{B}(x_i)$. Then:


a. A sequence $s = (s_1, s_2, \ldots)$ satisfies $\mathcal{B}(t)$ if and only if the sequence s′, obtained from s by substituting $s^*(t)$ for $s_i$ in the ith place, satisfies $\mathcal{B}(x_i)$. [Hint: Use induction on the number of occurrences of connectives and quantifiers in $\mathcal{B}(x_i)$, applying Lemma 1.]

b. If $(\forall x_i)\mathcal{B}(x_i)$ is satisfied by the sequence s, then $\mathcal{B}(t)$ also is satisfied by s.


XI. If $\mathcal{B}$ does not contain $x_i$ free, then $(\forall x_i)(\mathcal{B} \Rightarrow \mathcal{C}) \Rightarrow (\mathcal{B} \Rightarrow (\forall x_i)\mathcal{C})$ is true for all interpretations.


Proof 


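Assume (XI) is not correct. Then $(\forall x_i)(\mathcal{B} \Rightarrow \mathcal{C}) \Rightarrow (\mathcal{B} \Rightarrow (\forall x_i)\mathcal{C})$ is not true for some interpretation. By condition 3 of the definition of satisfaction, there is a sequence s such that s satisfies $(\forall x_i)(\mathcal{B} \Rightarrow \mathcal{C})$ and s does not satisfy $\mathcal{B} \Rightarrow (\forall x_i)\mathcal{C}$. From the latter and condition 3, s satisfies $\mathcal{B}$ and s does not satisfy $(\forall x_i)\mathcal{C}$. Hence, by condition 4, there is a sequence s′, differing from s in at most the ith place, such that s′ does not satisfy $\mathcal{C}$. Since $x_i$ is free in neither $(\forall x_i)(\mathcal{B} \Rightarrow \mathcal{C})$ nor $\mathcal{B}$, and since s satisfies both of these wfs, it follows by (VIII) that s′ also satisfies both $(\forall x_i)(\mathcal{B} \Rightarrow \mathcal{C})$ and $\mathcal{B}$. Since s′ satisfies $(\forall x_i)(\mathcal{B} \Rightarrow \mathcal{C})$, it follows by condition 4 that s′ satisfies $\mathcal{B} \Rightarrow \mathcal{C}$. Since s′ satisfies $\mathcal{B} \Rightarrow \mathcal{C}$ and $\mathcal{B}$, condition 3 implies that s′ satisfies $\mathcal{C}$, which contradicts the fact that s′ does not satisfy $\mathcal{C}$. Hence, (XI) is established.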


Exercises 


2.12 Verify (I)-(X).

2.13 Prove that a closed wf $\mathcal{B}$ is true for M if and only if $\mathcal{B}$ is satisfied by some sequence s in $\Sigma$. (Remember that $\Sigma$ is the set of denumerable sequences of elements in the domain of M.)


2.14 Find the properties or relations determined by the following wfs and interpretations.

a. $[(\exists u)A_1^2(f_1^2(x, u), y)] \land [(\exists v)A_1^2(f_1^2(x, v), z)]$, where the domain D is the set of integers, $A_1^2$ is =, and $f_1^2$ is multiplication.

b. Here, D is the set of nonnegative integers, $A_1^2$ is =, $a_1$ denotes 0, $f_1^2$ is addition, and $f_2^2$ is multiplication.
i. $[(\exists z)(\neg A_1^2(z, a_1) \land A_1^2(f_1^2(x, z), y))]$
ii. $(\exists y)A_1^2(x, f_2^2(y, y))$

c. $(\exists x_3)A_1^2(f_1^2(x_1, x_3), x_2)$, where D is the set of positive integers, $A_1^2$ is =, and $f_1^2$ is multiplication.

d. $A_1^1(x_1) \land (\forall x_2)\neg A_1^2(x_1, x_2)$, where D is the set of all living people, $A_1^1(x)$ means x is a man and $A_1^2(x, y)$ means x is married to y.

e. i. $(\exists x_1)(\exists x_2)(A_1^2(x_1, x_3) \land A_1^2(x_2, x_4) \land A_2^2(x_1, x_2))$
ii. $(\exists x_3)(A_1^2(x_1, x_3) \land A_1^2(x_3, x_2))$
where D is the set of all people, $A_1^2(x, y)$ means x is a parent of y, and $A_2^2(x, y)$ means x and y are siblings.

f. $(\forall x_3)((\exists x_4)(A_1^2(f_1^2(x_4, x_3), x_1)) \land (\exists x_4)(A_1^2(f_1^2(x_4, x_3), x_2)) \Rightarrow A_1^2(x_3, a_1))$, where D is the set of positive integers, $A_1^2$ is =, $f_1^2$ is multiplication, and $a_1$ denotes 1.

g. $\neg A_1^2(x_2, x_1) \land (\exists y)(A_1^2(y, x_1) \land A_2^2(x_2, y))$, where D is the set of all people, $A_1^2(u, v)$ means u is a parent of v, and $A_2^2(u, v)$ means u is a wife of v.
2.15 For each of the following sentences and interpretations, write a translation into ordinary English and determine its truth or falsity.

a. The domain D is the set of nonnegative integers, $A_1^2$ is =, $f_1^2$ is addition, $f_2^2$ is multiplication, $a_1$ denotes 0, and $a_2$ denotes 1.
i. $(\forall x)(\exists y)(A_1^2(x, f_1^2(y, y)) \lor A_1^2(x, f_1^2(f_1^2(y, y), a_2)))$
ii. $(\forall x)(\forall y)(A_1^2(f_2^2(x, y), a_1) \Rightarrow A_1^2(x, a_1) \lor A_1^2(y, a_1))$
iii. $(\exists y)A_1^2(f_1^2(y, y), a_2)$

b. Here, D is the set of integers, $A_1^2$ is =, and $f_1^2$ is addition.
i. $(\forall x_1)(\forall x_2)A_1^2(f_1^2(x_1, x_2), f_1^2(x_2, x_1))$
ii. $(\forall x_1)(\forall x_2)(\forall x_3)A_1^2(f_1^2(x_1, f_1^2(x_2, x_3)), f_1^2(f_1^2(x_1, x_2), x_3))$
iii. $(\forall x_1)(\forall x_2)(\exists x_3)A_1^2(f_1^2(x_1, x_3), x_2)$

c. The wfs are the same as in part (b), but the domain is the set of positive integers, $A_1^2$ is =, and $f_1^2(x, y)$ is $x^y$.

d. The domain is the set of rational numbers, $A_1^2$ is =, $A_2^2$ is <, $f_1^2$ is multiplication, $f_1^1(x)$ is $x + 1$, and $a_1$ denotes 0.
i. $(\exists x)A_1^2(f_1^2(x, x), f_1^1(f_1^1(a_1)))$
ii. $(\forall x)(\forall y)(A_2^2(x, y) \Rightarrow (\exists z)(A_2^2(x, z) \land A_2^2(z, y)))$
iii. $(\forall x)(\neg A_1^2(x, a_1) \Rightarrow (\exists y)A_1^2(f_1^2(x, y), f_1^1(a_1)))$

e. The domain is the set of nonnegative integers, $A_1^2(u, v)$ means $u < v$, and $A_1^3(u, v, w)$ means $u + v = w$.
i. $(\forall x)(\forall y)(\forall z)(A_1^3(x, y, z) \Rightarrow A_1^3(y, x, z))$
ii. $(\forall x)(\forall y)(A_1^3(x, x, y) \Rightarrow A_1^2(x, y))$
iii. $(\forall x)(\forall y)(A_1^2(x, y) \Rightarrow A_1^3(x, x, y))$
iv. $(\exists x)(\forall y)A_1^3(x, y, y)$
v. $(\exists y)(\forall x)A_1^2(x, y)$
vi. $(\forall x)(\forall y)(A_1^2(x, y) \Leftrightarrow (\exists z)A_1^3(x, z, y))$

f. The domain is the set of nonnegative integers, $A_1^2(u, v)$ means $u = v$, $f_1^2(u, v) = u + v$, and $f_2^2(u, v) = u \cdot v$.

$(\forall x)(\exists y)(\exists z)A_1^2(x, f_1^2(f_2^2(y, y), f_2^2(z, z)))$


Definitions 


A wf $\mathcal{B}$ is said to be logically valid if and only if $\mathcal{B}$ is true for every interpretation.*

$\mathcal{B}$ is said to be satisfiable if and only if there is an interpretation for which $\mathcal{B}$ is satisfied by at least one sequence.

It is obvious that $\mathcal{B}$ is logically valid if and only if $\neg\mathcal{B}$ is not satisfiable, and $\mathcal{B}$ is satisfiable if and only if $\neg\mathcal{B}$ is not logically valid.

If $\mathcal{B}$ is a closed wf, then we know that $\mathcal{B}$ is either true or false for any given interpretation; that is, $\mathcal{B}$ is satisfied by all sequences or by none. Therefore, if $\mathcal{B}$ is closed, then $\mathcal{B}$ is satisfiable if and only if $\mathcal{B}$ is true for some interpretation.

A set $\Gamma$ of wfs is said to be satisfiable if and only if there is an interpretation in which there is a sequence that satisfies every wf of $\Gamma$.

It is impossible for both a wf $\mathcal{B}$ and its negation $\neg\mathcal{B}$ to be logically valid. For if $\mathcal{B}$ is true for an interpretation, then $\neg\mathcal{B}$ is false for that interpretation.

We say that $\mathcal{B}$ is contradictory if and only if $\mathcal{B}$ is false for every interpretation, or, equivalently, if and only if $\neg\mathcal{B}$ is logically valid.

$\mathcal{B}$ is said to logically imply $\mathcal{C}$ if and only if, in every interpretation, every sequence that satisfies $\mathcal{B}$ also satisfies $\mathcal{C}$. More generally, $\mathcal{C}$ is said to be a logical consequence of a set $\Gamma$ of wfs if and only if, in every interpretation, every sequence that satisfies every wf in $\Gamma$ also satisfies $\mathcal{C}$.

$\mathcal{B}$ and $\mathcal{C}$ are said to be logically equivalent if and only if they logically imply each other.

The following assertions are easy consequences of these definitions. 


1. $\mathcal{B}$ logically implies $\mathcal{C}$ if and only if $\mathcal{B} \Rightarrow \mathcal{C}$ is logically valid.

2. $\mathcal{B}$ and $\mathcal{C}$ are logically equivalent if and only if $\mathcal{B} \Leftrightarrow \mathcal{C}$ is logically valid.

3. If $\mathcal{B}$ logically implies $\mathcal{C}$ and $\mathcal{B}$ is true in a given interpretation, then so is $\mathcal{C}$.

4. If $\mathcal{C}$ is a logical consequence of a set $\Gamma$ of wfs and all wfs in $\Gamma$ are true in a given interpretation, then so is $\mathcal{C}$.


* The mathematician and philosopher G.W. Leibniz (1646-1716) gave a similar definition: $\mathcal{B}$ is logically valid if and only if $\mathcal{B}$ is true in all "possible worlds."
Exercise


2.16 Prove assertions 1-4.


Examples


1. Every instance of a tautology is logically valid (VII).

2. If t is free for x in $\mathcal{B}(x)$, then $(\forall x)\mathcal{B}(x) \Rightarrow \mathcal{B}(t)$ is logically valid (X).

3. If $\mathcal{B}$ does not contain x free, then $(\forall x)(\mathcal{B} \Rightarrow \mathcal{C}) \Rightarrow (\mathcal{B} \Rightarrow (\forall x)\mathcal{C})$ is logically valid (XI).

4. $\mathcal{B}$ is logically valid if and only if $(\forall y_1) \ldots (\forall y_n)\mathcal{B}$ is logically valid (VI).

5. The wf $(\forall x_2)(\exists x_1)A_1^2(x_1, x_2) \Rightarrow (\exists x_1)(\forall x_2)A_1^2(x_1, x_2)$ is not logically valid. As a counterexample, let the domain D be the set of integers and let $A_1^2(y, z)$ mean $y < z$. Then $(\forall x_2)(\exists x_1)A_1^2(x_1, x_2)$ is true but $(\exists x_1)(\forall x_2)A_1^2(x_1, x_2)$ is false.
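
Failure of logical validity can also be witnessed on a finite domain, where it is checkable by brute force. The sketch below does not use the counterexample from the text (the integers under <, which is infinite); instead it assumes a two-element domain and searches all 16 binary relations on it for ones that make the antecedent of the wf in Example 5 true and the consequent false. The names `D`, `pairs`, and `falsifies` are hypothetical.

```python
from itertools import product

D = [0, 1]
pairs = list(product(D, repeat=2))

def falsifies(rel):
    """True iff (forall x2)(exists x1)A(x1,x2) holds but (exists x1)(forall x2)A(x1,x2) fails."""
    antecedent = all(any((x1, x2) in rel for x1 in D) for x2 in D)
    consequent = any(all((x1, x2) in rel for x2 in D) for x1 in D)
    return antecedent and not consequent

# Enumerate every subset of D x D as a candidate interpretation of A.
counterexamples = []
for bits in product([False, True], repeat=len(pairs)):
    rel = {p for p, keep in zip(pairs, bits) if keep}
    if falsifies(rel):
        counterexamples.append(rel)

identity = {(0, 0), (1, 1)}
print(identity in counterexamples, len(counterexamples))
```

Interpreting the predicate letter as identity is thus already enough: every element equals something, but nothing equals everything.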




Exercises 


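2.17 Show that the following wfs are not logically valid.

a. $[(\forall x_1)A_1^1(x_1) \Rightarrow (\forall x_1)A_2^1(x_1)] \Rightarrow [(\forall x_1)(A_1^1(x_1) \Rightarrow A_2^1(x_1))]$

b. $[(\forall x_1)(A_1^1(x_1) \lor A_2^1(x_1))] \Rightarrow [((\forall x_1)A_1^1(x_1)) \lor (\forall x_1)A_2^1(x_1)]$

2.18 Show that the following wfs are logically valid.*

a. $\mathcal{B}(t) \Rightarrow (\exists x_i)\mathcal{B}(x_i)$ if t is free for $x_i$ in $\mathcal{B}(x_i)$

b. $(\forall x_i)\mathcal{B} \Rightarrow (\exists x_i)\mathcal{B}$

c. $(\forall x_i)(\forall x_j)\mathcal{B} \Rightarrow (\forall x_j)(\forall x_i)\mathcal{B}$

d. $(\forall x_i)\mathcal{B} \Leftrightarrow \neg(\exists x_i)\neg\mathcal{B}$

e. $(\forall x_i)(\mathcal{B} \Rightarrow \mathcal{C}) \Rightarrow ((\forall x_i)\mathcal{B} \Rightarrow (\forall x_i)\mathcal{C})$

f. $((\forall x_i)\mathcal{B}) \land (\forall x_i)\mathcal{C} \Leftrightarrow (\forall x_i)(\mathcal{B} \land \mathcal{C})$

g. $((\forall x_i)\mathcal{B}) \lor (\forall x_i)\mathcal{C} \Rightarrow (\forall x_i)(\mathcal{B} \lor \mathcal{C})$

h. $(\exists x_i)(\exists x_j)\mathcal{B} \Leftrightarrow (\exists x_j)(\exists x_i)\mathcal{B}$

i. $(\exists x_i)(\forall x_j)\mathcal{B} \Rightarrow (\forall x_j)(\exists x_i)\mathcal{B}$

2.19 a. If $\mathcal{B}$ is a closed wf, show that $\mathcal{B}$ logically implies $\mathcal{C}$ if and only if $\mathcal{C}$ is true for every interpretation for which $\mathcal{B}$ is true.

b. Although, by (VI), $(\forall x_1)A_1^1(x_1)$ is true whenever $A_1^1(x_1)$ is true, find an interpretation for which $A_1^1(x_1) \Rightarrow (\forall x_1)A_1^1(x_1)$ is not true. (Hence, the hypothesis that $\mathcal{B}$ is a closed wf is essential in (a).)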


* At this point, one can use intuitive arguments or one can use the rigorous definitions of 
satisfaction and truth, as in the argument above for (XI). Later on, we shall discover another 
method for showing logical validity. 
2.20 Prove that, if the free variables of $\mathcal{B}$ are $y_1, \ldots, y_n$, then $\mathcal{B}$ is satisfiable if and only if $(\exists y_1) \ldots (\exists y_n)\mathcal{B}$ is satisfiable.

2.21 Produce counterexamples to show that the following wfs are not logically valid (that is, in each case, find an interpretation for which the wf is not true).

a. $[(\forall x)(\forall y)(\forall z)(A_1^2(x, y) \land A_1^2(y, z) \Rightarrow A_1^2(x, z)) \land (\forall x)\neg A_1^2(x, x)] \Rightarrow (\exists x)(\forall y)\neg A_1^2(x, y)$

b. $(\forall x)(\exists y)A_1^2(x, y) \Rightarrow (\exists y)A_1^2(y, y)$

c. $(\exists x)(\exists y)A_1^2(x, y) \Rightarrow (\exists y)A_1^2(y, y)$

d. $[(\exists x)A_1^1(x) \Leftrightarrow (\exists x)A_2^1(x)] \Rightarrow (\forall x)(A_1^1(x) \Leftrightarrow A_2^1(x))$

e. $(\exists x)(A_1^1(x) \Rightarrow A_2^1(x)) \Rightarrow ((\exists x)A_1^1(x) \Rightarrow (\exists x)A_2^1(x))$

f. $[(\forall x)(\forall y)(A_1^2(x, y) \Rightarrow A_1^2(y, x)) \land (\forall x)(\forall y)(\forall z)(A_1^2(x, y) \land A_1^2(y, z) \Rightarrow A_1^2(x, z))] \Rightarrow (\forall x)A_1^2(x, x)$

g. $(\exists x)(\forall y)(A_1^2(x, y) \land \neg A_1^2(y, x) \Rightarrow [A_1^2(x, x) \Leftrightarrow A_1^2(y, y)])$

h. $(\forall x)(\forall y)(\forall z)(A_1^2(x, x) \land (A_1^2(x, z) \Rightarrow A_1^2(x, y) \lor A_1^2(y, z))) \Rightarrow (\exists y)(\forall z)A_1^2(y, z)$

i. $(\exists x)(\forall y)(\exists z)((A_1^2(y, z) \Rightarrow A_1^2(x, z)) \Rightarrow (A_1^2(x, x) \Rightarrow A_1^2(y, x)))$

2.22 By introducing appropriate notation, write the sentences of each of the following arguments as wfs and determine whether the argument is correct, that is, determine whether the conclusion is logically implied by the conjunction of the premisses.

a. All scientists are neurotic. No vegetarians are neurotic. Therefore, no vegetarians are scientists.

b. All men are animals. Some animals are carnivorous. Therefore, some men are carnivorous.

c. Some geniuses are celibate. Some students are not celibate. Therefore, some students are not geniuses.

d. Any barber in Jonesville shaves exactly those men in Jonesville who do not shave themselves. Hence, there is no barber in Jonesville.

e. For any numbers x, y, z, if x > y and y > z, then x > z. x > x is false for all numbers x. Therefore, for any numbers x and y, if x > y, then it is not the case that y > x.

f. No student in the statistics class is smarter than every student in the logic class. Hence, some student in the logic class is smarter than every student in the statistics class.

g. Everyone who is sane can understand mathematics. None of Hegel's sons can understand mathematics. No madmen are fit to vote. Hence, none of Hegel's sons is fit to vote.

h. For every set x, there is a set y such that the cardinality of y is greater than the cardinality of x. If x is included in y, the cardinality of x is not greater than the cardinality of y. Every set is included in V. Hence, V is not a set.

i. For all positive integers x, x ≤ x. For all positive integers x, y, z, if x ≤ y and y ≤ z, then x ≤ z. For all positive integers x and y, x ≤ y or y ≤ x. Therefore, there is a positive integer y such that, for all positive integers x, y ≤ x.

j. For any integers x, y, z, if x > y and y > z, then x > z. x > x is false for all integers x. Therefore, for any integers x and y, if x > y, then it is not the case that y > x.


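2.23 Determine whether the following sets of wfs are compatible, that is, whether their conjunction is satisfiable.

a. $(\exists x)(\exists y)A_1^2(x, y)$
$(\forall x)(\forall y)(\exists z)(A_1^2(x, z) \land A_1^2(z, y))$

b. $(\forall x)(\exists y)A_1^2(y, x)$
$(\forall x)(\forall y)(A_1^2(x, y) \Rightarrow \neg A_1^2(y, x))$
$(\forall x)(\forall y)(\forall z)(A_1^2(x, y) \land A_1^2(y, z) \Rightarrow A_1^2(x, z))$

c. All unicorns are animals.
No unicorns are animals.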


2.24 Determine whether the following wfs are logically valid. 


a. 
b. 
c. 


d. 


BY) (Vx)(Ai (x,y) > AAI(x, x) 

[(Ax)Ai(x) > (Ax) A3(x)] > (2x)(Ar(x) > A2(x)) 
(Ax)(Ai(x) > (Vy)Ai(y)) 

(Wx)(Ai(x) v Aa(x)) > (((Wx)Ar(x)) v (Ax) A2(x)) 
(Ax)(Ay)(Ar (x,y) > (Vz)Ar(z, y) 
(Ax)y)(Ai (x) => Aa(y)) > Ax)(Ai(x) => Aa(x)) 


(Wx)(Ai(x) > A2(x)) > (Vx)(Ai(x) > AA2(x)) 


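h. $(\exists x)A_1^2(x, x) \Rightarrow (\exists x)(\exists y)A_1^2(x, y)$

i. $((\exists x)A_1^1(x)) \land (\exists x)A_2^1(x) \Rightarrow (\exists x)(A_1^1(x) \land A_2^1(x))$

j. $((\forall x)A_1^1(x)) \lor (\forall x)A_2^1(x) \Rightarrow (\forall x)(A_1^1(x) \lor A_2^1(x))$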


2.25 Exhibit a logically valid wf that is not an instance of a tautology. 
However, show that any logically valid open wf (that is, a wf without
quantifiers) must be an instance of a tautology. 
2.26 a. Find a satisfiable closed wf that is not true in any interpretation
whose domain has only one member. 


b. Find a satisfiable closed wf that is not true in any interpretation 
whose domain has fewer than three members. 


2.3 First-Order Theories 


In the case of the propositional calculus, the method of truth tables provides an 
effective test as to whether any given statement form is a tautology. However, 
there does not seem to be any effective process for determining whether a 
given wf is logically valid, since, in general, one has to check the truth of a 
wf for interpretations with arbitrarily large finite or infinite domains. In fact, 
we shall see later that, according to a plausible definition of “effective,” it may 
actually be proved that there is no effective way to test for logical validity. The 
axiomatic method, which was a luxury in the study of the propositional cal- 
culus, thus appears to be a necessity in the study of wfs involving quantifiers,* 
and we therefore turn now to the consideration of first-order theories. 

Let $\mathcal{L}$ be a first-order language. A first-order theory in the language $\mathcal{L}$ will be a formal theory K whose symbols and wfs are the symbols and wfs of $\mathcal{L}$ and whose axioms and rules of inference are specified in the following way.†

The axioms of K are divided into two classes: the logical axioms and the 
proper (or nonlogical) axioms. 


2.3.1 Logical Axioms 


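If $\mathcal{B}$, $\mathcal{C}$, and $\mathcal{D}$ are wfs of $\mathcal{L}$, then the following are logical axioms of K:

(A1) $\mathcal{B} \Rightarrow (\mathcal{C} \Rightarrow \mathcal{B})$

(A2) $(\mathcal{B} \Rightarrow (\mathcal{C} \Rightarrow \mathcal{D})) \Rightarrow ((\mathcal{B} \Rightarrow \mathcal{C}) \Rightarrow (\mathcal{B} \Rightarrow \mathcal{D}))$

(A3) $(\neg\mathcal{C} \Rightarrow \neg\mathcal{B}) \Rightarrow ((\neg\mathcal{C} \Rightarrow \mathcal{B}) \Rightarrow \mathcal{C})$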


* There is still another reason for a formal axiomatic approach. Concepts and propositions 
that involve the notion of interpretation and related ideas such as truth and model are often 
called semantical to distinguish them from syntactical concepts, which refer to simple rela- 
tions among symbols and expressions of precise formal languages. Since semantical notions 
are set-theoretic in character, and since set theory, because of the paradoxes, is considered 
a rather shaky foundation for the study of mathematical logic, many logicians consider a 
syntactical approach, consisting of a study of formal axiomatic theories using only rather 
weak number-theoretic methods, to be much safer. For further discussions, see the pioneer- 
ing study on semantics by Tarski (1936), as well as Kleene (1952), Church (1956), and Hilbert 
and Bernays (1934). 

† The reader might wish to review the definition of formal theory in Section 1.4. We shall use the terminology (proof, theorem, consequence, axiomatic, etc.) and notation ($\Gamma \vdash \mathcal{B}$, $\vdash \mathcal{B}$) introduced there.


(A4) $(\forall x_i)\mathcal{B}(x_i) \Rightarrow \mathcal{B}(t)$ if $\mathcal{B}(x_i)$ is a wf of $\mathcal{L}$ and t is a term of $\mathcal{L}$ that is free for $x_i$ in $\mathcal{B}(x_i)$. Note here that t may be identical with $x_i$ so that all wfs $(\forall x_i)\mathcal{B} \Rightarrow \mathcal{B}$ are axioms by virtue of axiom (A4).


(A5) $(\forall x_i)(\mathcal{B} \Rightarrow \mathcal{C}) \Rightarrow (\mathcal{B} \Rightarrow (\forall x_i)\mathcal{C})$ if $\mathcal{B}$ contains no free occurrences of $x_i$.


2.3.2 Proper Axioms 


These cannot be specified, since they vary from theory to theory. A first- 
order theory in which there are no proper axioms is called a first-order predi- 
cate calculus. 


2.3.3 Rules of Inference 


The rules of inference of any first-order theory are: 


1. Modus ponens: $\mathcal{C}$ follows from $\mathcal{B}$ and $\mathcal{B} \Rightarrow \mathcal{C}$.


2. Generalization: $(\forall x_i)\mathcal{B}$ follows from $\mathcal{B}$.


We shall use the abbreviations MP and Gen, respectively, to indicate applica- 
tions of these rules. 


Definition 


Let K be a first-order theory in the language $\mathcal{L}$. By a model of K we mean an interpretation of $\mathcal{L}$ for which all the axioms of K are true.

By (III) and (VI) on page 57, if the rules of modus ponens and general- 
ization are applied to wfs that are true for a given interpretation, then the 
results of these applications are also true. Hence every theorem of K is true in 
every model of K. 

As we shall see, the logical axioms are so designed that the logical conse- 
quences (in the sense defined on pages 63-64) of the closures of the axioms of 
K are precisely the theorems of K. In particular, if K is a first-order predicate 
calculus, it turns out that the theorems of K are just those wfs of K that are 
logically valid. 

Some explanation is needed for the restrictions in axiom schemas (A4) and (A5). In the case of (A4), if t were not free for $x_i$ in $\mathcal{B}(x_i)$, the following unpleasant result would arise: let $\mathcal{B}(x_1)$ be $\neg(\forall x_2)A_1^2(x_1, x_2)$ and let t be $x_2$. Notice that t is not free for $x_1$ in $\mathcal{B}(x_1)$. Consider the following pseudo-instance of axiom (A4):


$(\nabla)$  $(\forall x_1)(\neg(\forall x_2)A_1^2(x_1, x_2)) \Rightarrow \neg(\forall x_2)A_1^2(x_2, x_2)$


Now take as interpretation any domain with at least two members and let $A_1^2$ stand for the identity relation. Then the antecedent of $(\nabla)$ is true and the consequent false. Thus, $(\nabla)$ is false for this interpretation.

In the case of axiom (A5), relaxation of the restriction that $x_i$ not be free in $\mathcal{B}$ would lead to the following disaster. Let $\mathcal{B}$ and $\mathcal{C}$ both be $A_1^1(x_1)$. Thus, $x_1$ is free in $\mathcal{B}$. Consider the following pseudo-instance of axiom (A5):


$(\nabla\nabla)$  $(\forall x_1)(A_1^1(x_1) \Rightarrow A_1^1(x_1)) \Rightarrow (A_1^1(x_1) \Rightarrow (\forall x_1)A_1^1(x_1))$


The antecedent of $(\nabla\nabla)$ is logically valid. Now take as domain the set of integers and let $A_1^1(x)$ mean that x is even. Then $(\forall x_1)A_1^1(x_1)$ is false. So, any sequence $s = (s_1, s_2, \ldots)$ for which $s_1$ is even does not satisfy the consequent of $(\nabla\nabla)$.* Hence, $(\nabla\nabla)$ is not true for this interpretation.
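
Both pseudo-instances can be evaluated by brute force on small domains. The sketch below makes two assumptions that go beyond the text: the first check uses the two-element domain {0, 1} with identity, and the second replaces the set of all integers by the finite fragment {-3, ..., 3}, which still contains both even and odd numbers. All names are introduced for illustration.

```python
# Pseudo-instance of (A4): domain with two members, predicate read as identity.
D = [0, 1]
same = lambda y, z: y == z

a4_antecedent = all(not all(same(x1, x2) for x2 in D) for x1 in D)  # (forall x1) not (forall x2) A(x1, x2)
a4_consequent = not all(same(x2, x2) for x2 in D)                   # not (forall x2) A(x2, x2)
a4_wf = (not a4_antecedent) or a4_consequent

# Pseudo-instance of (A5): finite fragment of the integers, predicate read as "is even".
Z = range(-3, 4)
even = lambda x: x % 2 == 0

a5_antecedent = all((not even(x1)) or even(x1) for x1 in Z)  # (forall x1)(A(x1) => A(x1))
all_even = all(even(x1) for x1 in Z)                         # (forall x1) A(x1)

def a5_wf(s1):
    """Value of the whole wf for a sequence whose first component is s1."""
    consequent = (not even(s1)) or all_even                  # A(x1) => (forall x1) A(x1)
    return (not a5_antecedent) or consequent

print(a4_antecedent, a4_consequent, a4_wf)      # True False False
print([s1 for s1 in Z if not a5_wf(s1)])        # [-2, 0, 2]
```

The second wf has a free variable, so it is evaluated as a function of $s_1$; it fails exactly for the even values, which is enough to show that it is not true for the interpretation.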


Examples of first-order theories 


1. Partial order. Let the language have a single predicate letter $A_2^2$ and
no function letters and individual constants. We shall write $x_i < x_j$
instead of $A_2^2(x_i, x_j)$. The theory K has two proper axioms.

a. $(\forall x_1)(\neg\, x_1 < x_1)$ (irreflexivity)
b. $(\forall x_1)(\forall x_2)(\forall x_3)(x_1 < x_2 \wedge x_2 < x_3 \Rightarrow x_1 < x_3)$ (transitivity)
A model of the theory is called a partially ordered structure.

2. Group theory. Let the language have one predicate letter $A_1^2$, one
function letter $f_1^2$, and one individual constant $a_1$. To conform with
ordinary notation, we shall write $t = s$ instead of $A_1^2(t, s)$, $t + s$ instead
of $f_1^2(t, s)$, and 0 instead of $a_1$. The proper axioms of K are:

a. $(\forall x_1)(\forall x_2)(\forall x_3)(x_1 + (x_2 + x_3) = (x_1 + x_2) + x_3)$ (associativity)

b. $(\forall x_1)(0 + x_1 = x_1)$ (identity)

c. $(\forall x_1)(\exists x_2)(x_2 + x_1 = 0)$ (inverse)

d. $(\forall x_1)(x_1 = x_1)$ (reflexivity of =)

e. $(\forall x_1)(\forall x_2)(x_1 = x_2 \Rightarrow x_2 = x_1)$ (symmetry of =)

f. $(\forall x_1)(\forall x_2)(\forall x_3)(x_1 = x_2 \wedge x_2 = x_3 \Rightarrow x_1 = x_3)$ (transitivity of =)

g. $(\forall x_1)(\forall x_2)(\forall x_3)(x_2 = x_3 \Rightarrow x_1 + x_2 = x_1 + x_3 \wedge x_2 + x_1 = x_3 + x_1)$ (substitutivity of =)


A model for this theory, in which the interpretation of = is the identity
relation, is called a group. A group is said to be abelian if, in addition, the wf
$(\forall x_1)(\forall x_2)(x_1 + x_2 = x_2 + x_1)$ is true.
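
As a small sanity check on the group axioms, the following sketch tests by brute force whether a finite structure is a model in which = is interpreted as identity. The choice of the integers modulo 5 and the helper names `is_group` and `is_abelian` are illustrative assumptions, not part of the text.

```python
from itertools import product

def is_group(domain, add, zero):
    """Brute-force check of proper axioms (a)-(c) on a finite domain.

    Axioms (d)-(g) hold automatically here because '=' is interpreted
    as identity (Python's ==) on the domain.
    """
    dom = list(domain)
    assoc = all(add(x, add(y, z)) == add(add(x, y), z)
                for x, y, z in product(dom, repeat=3))
    ident = all(add(zero, x) == x for x in dom)
    # Axiom (c): for every x1 there is some x2 with x2 + x1 = 0.
    inverse = all(any(add(y, x) == zero for y in dom) for x in dom)
    return assoc and ident and inverse

def is_abelian(domain, add):
    """Check the extra wf (forall x1)(forall x2)(x1 + x2 = x2 + x1)."""
    dom = list(domain)
    return all(add(x, y) == add(y, x) for x, y in product(dom, repeat=2))

# Hypothetical example structure: integers modulo 5 under addition.
n = 5
add_mod = lambda x, y: (x + y) % n
print(is_group(range(n), add_mod, 0), is_abelian(range(n), add_mod))
```

Such a check only makes sense for finite domains; for infinite structures the axioms have to be verified by argument rather than enumeration.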


* Such a sequence would satisfy $A_1^1(x_1)$, since $s_1$ is even, but would not satisfy $(\forall x_1)A_1^1(x_1)$, since
no sequence satisfies $(\forall x_1)A_1^1(x_1)$.
The theories of partial order and of groups are both axiomatic. In general,
any theory with a finite number of proper axioms is axiomatic, since it is 
obvious that one can effectively decide whether any given wf is a logical 
axiom. 


2.4 Properties of First-Order Theories 


All the results in this section refer to an arbitrary first-order theory K. Instead
of writing $\vdash_K \mathscr{B}$, we shall sometimes simply write $\vdash \mathscr{B}$. Moreover, we shall
refer to first-order theories simply as theories, unless something is said to the
contrary.


Proposition 2.1 


Every wf $\mathscr{B}$ of K that is an instance of a tautology is a theorem of K, and it
may be proved using only axioms (A1)-(A3) and MP.


Proof 


$\mathscr{B}$ arises from a tautology $\mathscr{D}$ by substitution. By Proposition 1.14, there is a
proof of $\mathscr{D}$ in L. In such a proof, make the same substitution of wfs of K for
statement letters as were used in obtaining $\mathscr{B}$ from $\mathscr{D}$, and, for all statement
letters in the proof that do not occur in $\mathscr{D}$, substitute an arbitrary wf of K.
Then the resulting sequence of wfs is a proof of $\mathscr{B}$, and this proof uses only
axiom schemes (A1)-(A3) and MP.

The application of Proposition 2.1 in a proof will be indicated by writing 
“Tautology.” 


Proposition 2.2 
Every theorem of a first-order predicate calculus is logically valid. 


Proof 


Axioms (A1)-(A3) are logically valid by property (VII) of the notion of truth 
(see page 58), and axioms (A4) and (A5) are logically valid by properties (X) 
and (XI). By properties (III) and (VI), the rules of inference MP and Gen pre- 
serve logical validity. Hence, every theorem of a predicate calculus is logi- 
cally valid. 
Example


The wf $(\forall x_2)(\exists x_1)A_1^2(x_1, x_2) \Rightarrow (\exists x_1)(\forall x_2)A_1^2(x_1, x_2)$ is not a theorem of any
first-order predicate calculus, since it is not logically valid (by Example 5, page 63).


Definition 


A theory K is consistent if no wf $\mathscr{B}$ and its negation $\neg\mathscr{B}$ are both provable in K.
A theory is inconsistent if it is not consistent.


Corollary 2.3 


Any first-order predicate calculus is consistent. 


Proof 


If a wf $\mathscr{B}$ and its negation $\neg\mathscr{B}$ were both theorems of a first-order predicate
calculus, then, by Proposition 2.2, both $\mathscr{B}$ and $\neg\mathscr{B}$ would be logically valid,
which is impossible.

Notice that, in an inconsistent theory K, every wf $\mathscr{C}$ of K is provable in K. In
fact, assume that $\mathscr{B}$ and $\neg\mathscr{B}$ are both provable in K. Since the wf $\mathscr{B} \Rightarrow (\neg\mathscr{B} \Rightarrow \mathscr{C})$
is an instance of a tautology, that wf is, by Proposition 2.1, provable in K.
Then two applications of MP would yield $\vdash \mathscr{C}$.

It follows from this remark that, if some wf of a theory K is not a theorem 
of K, then K is consistent. 

The deduction theorem (Proposition 1.9) for the propositional calculus cannot
be carried over without modification to first-order theories. For example,
for any wf $\mathscr{B}$, $\mathscr{B} \vdash_K (\forall x)\mathscr{B}$, but it is not always the case that $\vdash_K \mathscr{B} \Rightarrow (\forall x)\mathscr{B}$.
Consider a domain containing at least two elements c and d. Let K be a predicate
calculus and let $\mathscr{B}$ be $A_1^1(x_1)$. Interpret $A_1^1$ as a property that holds only
for c. Then $A_1^1(x_1)$ is satisfied by any sequence $s = (s_1, s_2, \ldots)$ in which $s_1 = c$,
but $(\forall x_1)A_1^1(x_1)$ is satisfied by no sequence at all. Hence, $A_1^1(x_1) \Rightarrow (\forall x_1)A_1^1(x_1)$
is not true in this interpretation, and so it is not logically valid. Therefore, by
Proposition 2.2, $A_1^1(x_1) \Rightarrow (\forall x_1)A_1^1(x_1)$ is not a theorem of K.
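
The counterexample can be checked mechanically on the two-element domain. The sketch below is only an illustration: the names `domain`, `holds`, and the helper functions are assumptions introduced here, and only the value $s_1$ of a sequence matters because $x_1$ is the only variable involved.

```python
# Domain {c, d}; the predicate letter is interpreted as a property
# that holds only for c (as in the counterexample in the text).
domain = ["c", "d"]
holds = {"c"}

def sat_atom(s1):
    """A sequence with first component s1 satisfies A(x1) iff s1 has the property."""
    return s1 in holds

def sat_forall():
    """(forall x1)A(x1) is satisfied iff every element has the property."""
    return all(sat_atom(e) for e in domain)

def sat_implication(s1):
    """Satisfaction of A(x1) => (forall x1)A(x1) by a sequence starting with s1."""
    return (not sat_atom(s1)) or sat_forall()

results = {s1: sat_implication(s1) for s1 in domain}
true_in_interpretation = all(results.values())
print(results, true_in_interpretation)
```

The sequence with $s_1 = c$ fails to satisfy the implication, so the wf is not true in this interpretation even though sequences with $s_1 = d$ satisfy it vacuously.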

A modified, but still useful, form of the deduction theorem may be derived,
however. Let $\mathscr{B}$ be a wf in a set $\Gamma$ of wfs and assume that we are given a
deduction $\mathscr{D}_1, \ldots, \mathscr{D}_n$ from $\Gamma$, together with justification for each step in the
deduction. We shall say that $\mathscr{D}_i$ depends upon $\mathscr{B}$ in this proof if and only if:


1. $\mathscr{D}_i$ is $\mathscr{B}$ and the justification for $\mathscr{D}_i$ is that it belongs to $\Gamma$, or

2. $\mathscr{D}_i$ is justified as a direct consequence by MP or Gen of some preceding
wfs of the sequence, where at least one of these preceding wfs
depends upon $\mathscr{B}$.
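Example


$\mathscr{B}, (\forall x_1)\mathscr{B} \Rightarrow \mathscr{C} \vdash (\forall x_1)\mathscr{C}$


($\mathscr{D}_1$) $\mathscr{B}$                                  Hyp
($\mathscr{D}_2$) $(\forall x_1)\mathscr{B}$                     ($\mathscr{D}_1$), Gen
($\mathscr{D}_3$) $(\forall x_1)\mathscr{B} \Rightarrow \mathscr{C}$   Hyp
($\mathscr{D}_4$) $\mathscr{C}$                                  ($\mathscr{D}_2$), ($\mathscr{D}_3$), MP
($\mathscr{D}_5$) $(\forall x_1)\mathscr{C}$                     ($\mathscr{D}_4$), Gen


Here, ($\mathscr{D}_1$) depends upon $\mathscr{B}$, ($\mathscr{D}_2$) depends upon $\mathscr{B}$, ($\mathscr{D}_3$) depends upon
$(\forall x_1)\mathscr{B} \Rightarrow \mathscr{C}$, ($\mathscr{D}_4$) depends upon $\mathscr{B}$ and $(\forall x_1)\mathscr{B} \Rightarrow \mathscr{C}$, and ($\mathscr{D}_5$) depends upon
$\mathscr{B}$ and $(\forall x_1)\mathscr{B} \Rightarrow \mathscr{C}$.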


Proposition 2.4 


If $\mathscr{C}$ does not depend upon $\mathscr{B}$ in a deduction showing that $\Gamma, \mathscr{B} \vdash \mathscr{C}$, then $\Gamma \vdash \mathscr{C}$.


Proof 


Let $\mathscr{D}_1, \ldots, \mathscr{D}_n$ be a deduction of $\mathscr{C}$ from $\Gamma$ and $\mathscr{B}$, in which $\mathscr{C}$ does not depend
upon $\mathscr{B}$. (In this deduction, $\mathscr{D}_n$ is $\mathscr{C}$.) As an inductive hypothesis, let us
assume that the proposition is true for all deductions of length less than n. If
$\mathscr{C}$ belongs to $\Gamma$ or is an axiom, then $\Gamma \vdash \mathscr{C}$. If $\mathscr{C}$ is a direct consequence of one
or two preceding wfs by Gen or MP, then, since $\mathscr{C}$ does not depend upon $\mathscr{B}$,
neither do these preceding wfs. By the inductive hypothesis, these preceding
wfs are deducible from $\Gamma$ alone. Consequently, so is $\mathscr{C}$.


Proposition 2.5 (Deduction Theorem) 


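Assume that, in some deduction showing that $\Gamma, \mathscr{B} \vdash \mathscr{C}$, no application of Gen
to a wf that depends upon $\mathscr{B}$ has as its quantified variable a free variable
of $\mathscr{B}$. Then $\Gamma \vdash \mathscr{B} \Rightarrow \mathscr{C}$.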


Proof 


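Let $\mathscr{D}_1, \ldots, \mathscr{D}_n$ be a deduction of $\mathscr{C}$ from $\Gamma$ and $\mathscr{B}$, satisfying the assumption of
our proposition. (In this deduction, $\mathscr{D}_n$ is $\mathscr{C}$.) Let us show by induction that
$\Gamma \vdash \mathscr{B} \Rightarrow \mathscr{D}_i$ for each $i \le n$. If $\mathscr{D}_i$ is an axiom or belongs to $\Gamma$, then $\Gamma \vdash \mathscr{B} \Rightarrow \mathscr{D}_i$, since
$\mathscr{D}_i \Rightarrow (\mathscr{B} \Rightarrow \mathscr{D}_i)$ is an axiom. If $\mathscr{D}_i$ is $\mathscr{B}$, then $\Gamma \vdash \mathscr{B} \Rightarrow \mathscr{D}_i$, since, by Proposition
2.1, $\vdash \mathscr{B} \Rightarrow \mathscr{B}$. If there exist j and k less than i such that $\mathscr{D}_k$ is $\mathscr{D}_j \Rightarrow \mathscr{D}_i$, then, by
inductive hypothesis, $\Gamma \vdash \mathscr{B} \Rightarrow \mathscr{D}_j$ and $\Gamma \vdash \mathscr{B} \Rightarrow (\mathscr{D}_j \Rightarrow \mathscr{D}_i)$. Now, by axiom (A2),
$\vdash (\mathscr{B} \Rightarrow (\mathscr{D}_j \Rightarrow \mathscr{D}_i)) \Rightarrow ((\mathscr{B} \Rightarrow \mathscr{D}_j) \Rightarrow (\mathscr{B} \Rightarrow \mathscr{D}_i))$. Hence, by MP twice, $\Gamma \vdash \mathscr{B} \Rightarrow \mathscr{D}_i$.
Finally, suppose that there is some $j < i$ such that $\mathscr{D}_i$ is $(\forall x_k)\mathscr{D}_j$. By the inductive
hypothesis, $\Gamma \vdash \mathscr{B} \Rightarrow \mathscr{D}_j$, and, by the hypothesis of the theorem, either $\mathscr{D}_j$ does
not depend upon $\mathscr{B}$ or $x_k$ is not a free variable of $\mathscr{B}$. If $\mathscr{D}_j$ does not depend
upon $\mathscr{B}$, then, by Proposition 2.4, $\Gamma \vdash \mathscr{D}_j$, and, consequently, by Gen, $\Gamma \vdash (\forall x_k)\mathscr{D}_j$.
Thus, $\Gamma \vdash \mathscr{D}_i$. Now, by axiom (A1), $\vdash \mathscr{D}_i \Rightarrow (\mathscr{B} \Rightarrow \mathscr{D}_i)$. So, $\Gamma \vdash \mathscr{B} \Rightarrow \mathscr{D}_i$ by MP. If,
on the other hand, $x_k$ is not a free variable of $\mathscr{B}$, then, by axiom (A5),
$\vdash (\forall x_k)(\mathscr{B} \Rightarrow \mathscr{D}_j) \Rightarrow (\mathscr{B} \Rightarrow (\forall x_k)\mathscr{D}_j)$. Since $\Gamma \vdash \mathscr{B} \Rightarrow \mathscr{D}_j$, we have, by Gen, $\Gamma \vdash (\forall x_k)(\mathscr{B} \Rightarrow \mathscr{D}_j)$,
and so, by MP, $\Gamma \vdash \mathscr{B} \Rightarrow (\forall x_k)\mathscr{D}_j$; that is, $\Gamma \vdash \mathscr{B} \Rightarrow \mathscr{D}_i$. This completes the
induction, and our proposition is just the special case $i = n$.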

The hypothesis of Proposition 2.5 is rather cumbersome; the following 
weaker corollaries often prove to be more useful. 


Corollary 2.6 


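If a deduction showing that $\Gamma, \mathscr{B} \vdash \mathscr{C}$ involves no application of Gen of which
the quantified variable is free in $\mathscr{B}$, then $\Gamma \vdash \mathscr{B} \Rightarrow \mathscr{C}$.


Corollary 2.7


If $\mathscr{B}$ is a closed wf and $\Gamma, \mathscr{B} \vdash \mathscr{C}$, then $\Gamma \vdash \mathscr{B} \Rightarrow \mathscr{C}$.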


Extension of Propositions 2.4-2.7 


In Propositions 2.4-2.7, the following additional conclusion can be drawn from
the proofs. The new proof of $\Gamma \vdash \mathscr{B} \Rightarrow \mathscr{C}$ (in Proposition 2.4, of $\Gamma \vdash \mathscr{C}$) involves
an application of Gen to a wf depending upon a wf $\mathscr{E}$ of $\Gamma$ only if there is an
application of Gen in the given proof of $\Gamma, \mathscr{B} \vdash \mathscr{C}$ that involves the same
quantified variable and is applied to a wf that depends upon $\mathscr{E}$. (In the proof of
Proposition 2.5, one should observe that $\mathscr{D}_j$ depends upon a premiss $\mathscr{E}$ of $\Gamma$ in
the original proof if and only if $\mathscr{B} \Rightarrow \mathscr{D}_j$ depends upon $\mathscr{E}$ in the new proof.)

This supplementary conclusion will be useful when we wish to apply the
deduction theorem several times in a row to a given deduction, for example,
to obtain $\Gamma \vdash \mathscr{D} \Rightarrow (\mathscr{B} \Rightarrow \mathscr{C})$ from $\Gamma, \mathscr{D}, \mathscr{B} \vdash \mathscr{C}$; from now on, it is to be considered
an integral part of the statements of Propositions 2.4-2.7.


Example 
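$\vdash (\forall x_1)(\forall x_2)\mathscr{B} \Rightarrow (\forall x_2)(\forall x_1)\mathscr{B}$
Proof
1. $(\forall x_1)(\forall x_2)\mathscr{B}$                                  Hyp
2. $(\forall x_1)(\forall x_2)\mathscr{B} \Rightarrow (\forall x_2)\mathscr{B}$   (A4)
3. $(\forall x_2)\mathscr{B}$                                               1, 2, MP
4. $(\forall x_2)\mathscr{B} \Rightarrow \mathscr{B}$                       (A4)
5. $\mathscr{B}$                                                            3, 4, MP
6. $(\forall x_1)\mathscr{B}$                                               5, Gen
7. $(\forall x_2)(\forall x_1)\mathscr{B}$                                  6, Gen


Thus, by 1-7, we have $(\forall x_1)(\forall x_2)\mathscr{B} \vdash (\forall x_2)(\forall x_1)\mathscr{B}$, where, in the deduction, no
application of Gen has as a quantified variable a free variable of $(\forall x_1)(\forall x_2)\mathscr{B}$.
Hence, by Corollary 2.6, $\vdash (\forall x_1)(\forall x_2)\mathscr{B} \Rightarrow (\forall x_2)(\forall x_1)\mathscr{B}$.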


Exercises 


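2.27 Derive the following theorems.
a. $\vdash (\forall x)(\mathscr{B} \Rightarrow \mathscr{C}) \Rightarrow ((\forall x)\mathscr{B} \Rightarrow (\forall x)\mathscr{C})$
b. $\vdash (\forall x)(\mathscr{B} \Rightarrow \mathscr{C}) \Rightarrow ((\exists x)\mathscr{B} \Rightarrow (\exists x)\mathscr{C})$
c. $\vdash (\forall x)(\mathscr{B} \wedge \mathscr{C}) \Leftrightarrow (\forall x)\mathscr{B} \wedge (\forall x)\mathscr{C}$
d. $\vdash (\forall y_1) \ldots (\forall y_n)\mathscr{B} \Rightarrow \mathscr{B}$
e. $\vdash \neg(\forall x)\mathscr{B} \Rightarrow (\exists x)\neg\mathscr{B}$
2.28^D Let K be a first-order theory and let K* be an axiomatic theory having
the following axioms:
a. $(\forall y_1) \ldots (\forall y_n)\mathscr{B}$, where $\mathscr{B}$ is any axiom of K and $y_1, \ldots, y_n$ ($n \ge 0$) are
any variables (none at all when n = 0);
b. $(\forall y_1) \ldots (\forall y_n)(\mathscr{B} \Rightarrow \mathscr{C}) \Rightarrow [(\forall y_1) \ldots (\forall y_n)\mathscr{B} \Rightarrow (\forall y_1) \ldots (\forall y_n)\mathscr{C}]$, where $\mathscr{B}$
and $\mathscr{C}$ are any wfs and $y_1, \ldots, y_n$ are any variables.
Moreover, K* has modus ponens as its only rule of inference. Show
that K* has the same theorems as K. Thus, at the expense of adding
more axioms, the generalization rule can be dispensed with.


2.29 Carry out the proof of the Extension of Propositions 2.4-2.7 above.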


2.5 Additional Metatheorems and Derived Rules 


For the sake of smoothness in working with particular theories later, we 
shall introduce various techniques for constructing proofs. In this section it 
is assumed that we are dealing with an arbitrary theory K. 

Often one wants to obtain $\mathscr{B}(t)$ from $(\forall x)\mathscr{B}(x)$, where t is a term free for x in
$\mathscr{B}(x)$. This is allowed by the following derived rule.
2.5.1 Particularization Rule A4


If t is free for x in $\mathscr{B}(x)$, then $(\forall x)\mathscr{B}(x) \vdash \mathscr{B}(t)$.*


Proof


From $(\forall x)\mathscr{B}(x)$ and the instance $(\forall x)\mathscr{B}(x) \Rightarrow \mathscr{B}(t)$ of axiom (A4), we obtain $\mathscr{B}(t)$
by modus ponens.

Since x is free for x in $\mathscr{B}(x)$, a special case of rule A4 is: $(\forall x)\mathscr{B} \vdash \mathscr{B}$.

There is another very useful derived rule, which is essentially the contra- 
positive of rule A4. 


2.5.2 Existential Rule E4 


Let t be a term that is free for x in a wf $\mathscr{B}(x, t)$, and let $\mathscr{B}(t, t)$ arise from $\mathscr{B}(x, t)$ by
replacing all free occurrences of x by t. ($\mathscr{B}(x, t)$ may or may not contain
occurrences of t.) Then, $\mathscr{B}(t, t) \vdash (\exists x)\mathscr{B}(x, t)$.


Proof 


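It suffices to show that $\vdash \mathscr{B}(t, t) \Rightarrow (\exists x)\mathscr{B}(x, t)$. But, by axiom (A4),
$\vdash (\forall x)\neg\mathscr{B}(x, t) \Rightarrow \neg\mathscr{B}(t, t)$. Hence, by the tautology $(A \Rightarrow \neg B) \Rightarrow (B \Rightarrow \neg A)$ and MP,
$\vdash \mathscr{B}(t, t) \Rightarrow \neg(\forall x)\neg\mathscr{B}(x, t)$, which, in abbreviated form, is $\vdash \mathscr{B}(t, t) \Rightarrow (\exists x)\mathscr{B}(x, t)$.

A special case of rule E4 is $\mathscr{B}(t) \vdash (\exists x)\mathscr{B}(x)$, whenever t is free for x in $\mathscr{B}(x)$.
In particular, when t is x itself, $\mathscr{B}(x) \vdash (\exists x)\mathscr{B}(x)$.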


Example 


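$\vdash (\forall x)\mathscr{B} \Rightarrow (\exists x)\mathscr{B}$


1. $(\forall x)\mathscr{B}$                                      Hyp

2. $\mathscr{B}$                                                 1, rule A4

3. $(\exists x)\mathscr{B}$                                      2, rule E4

4. $(\forall x)\mathscr{B} \vdash (\exists x)\mathscr{B}$        1-3

5. $\vdash (\forall x)\mathscr{B} \Rightarrow (\exists x)\mathscr{B}$   1-4, Corollary 2.6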


The following derived rules are extremely useful. 


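Negation elimination: $\neg\neg\mathscr{B} \vdash \mathscr{B}$

Negation introduction: $\mathscr{B} \vdash \neg\neg\mathscr{B}$

Conjunction elimination: $\mathscr{B} \wedge \mathscr{C} \vdash \mathscr{B}$
    $\mathscr{B} \wedge \mathscr{C} \vdash \mathscr{C}$
    $\neg(\mathscr{B} \wedge \mathscr{C}) \vdash \neg\mathscr{B} \vee \neg\mathscr{C}$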


* From a strict point of view, $(\forall x)\mathscr{B}(x) \vdash \mathscr{B}(t)$ states a fact about derivability. Rule A4 should be
taken to mean that, if $(\forall x)\mathscr{B}(x)$ occurs as a step in a proof, we may write $\mathscr{B}(t)$ as a later step
(if t is free for x in $\mathscr{B}(x)$). As in this case, we shall often state a derived rule in the form of the
corresponding derivability result that justifies the rule.
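Conjunction introduction: $\mathscr{B}, \mathscr{C} \vdash \mathscr{B} \wedge \mathscr{C}$

Disjunction elimination: $\mathscr{B} \vee \mathscr{C}, \neg\mathscr{B} \vdash \mathscr{C}$
    $\mathscr{B} \vee \mathscr{C}, \neg\mathscr{C} \vdash \mathscr{B}$
    $\neg(\mathscr{B} \vee \mathscr{C}) \vdash \neg\mathscr{B} \wedge \neg\mathscr{C}$
    $\mathscr{B} \Rightarrow \mathscr{D}, \mathscr{C} \Rightarrow \mathscr{D}, \mathscr{B} \vee \mathscr{C} \vdash \mathscr{D}$

Disjunction introduction: $\mathscr{B} \vdash \mathscr{B} \vee \mathscr{C}$
    $\mathscr{C} \vdash \mathscr{B} \vee \mathscr{C}$

Conditional elimination: $\mathscr{B} \Rightarrow \mathscr{C}, \neg\mathscr{C} \vdash \neg\mathscr{B}$
    $\mathscr{B} \Rightarrow \neg\mathscr{C}, \mathscr{C} \vdash \neg\mathscr{B}$
    $\neg\mathscr{B} \Rightarrow \mathscr{C}, \neg\mathscr{C} \vdash \mathscr{B}$
    $\neg\mathscr{B} \Rightarrow \neg\mathscr{C}, \mathscr{C} \vdash \mathscr{B}$
    $\neg(\mathscr{B} \Rightarrow \mathscr{C}) \vdash \mathscr{B}$
    $\neg(\mathscr{B} \Rightarrow \mathscr{C}) \vdash \neg\mathscr{C}$

Conditional introduction: $\mathscr{B}, \neg\mathscr{C} \vdash \neg(\mathscr{B} \Rightarrow \mathscr{C})$

Conditional contrapositive: $\mathscr{B} \Rightarrow \mathscr{C} \vdash \neg\mathscr{C} \Rightarrow \neg\mathscr{B}$
    $\neg\mathscr{C} \Rightarrow \neg\mathscr{B} \vdash \mathscr{B} \Rightarrow \mathscr{C}$

Biconditional elimination: $\mathscr{B} \Leftrightarrow \mathscr{C}, \mathscr{B} \vdash \mathscr{C}$
    $\mathscr{B} \Leftrightarrow \mathscr{C}, \neg\mathscr{B} \vdash \neg\mathscr{C}$
    $\mathscr{B} \Leftrightarrow \mathscr{C}, \mathscr{C} \vdash \mathscr{B}$
    $\mathscr{B} \Leftrightarrow \mathscr{C}, \neg\mathscr{C} \vdash \neg\mathscr{B}$
    $\mathscr{B} \Leftrightarrow \mathscr{C} \vdash \mathscr{B} \Rightarrow \mathscr{C}$
    $\mathscr{B} \Leftrightarrow \mathscr{C} \vdash \mathscr{C} \Rightarrow \mathscr{B}$

Biconditional introduction: $\mathscr{B} \Rightarrow \mathscr{C}, \mathscr{C} \Rightarrow \mathscr{B} \vdash \mathscr{B} \Leftrightarrow \mathscr{C}$

Biconditional negation: $\mathscr{B} \Leftrightarrow \mathscr{C} \vdash \neg\mathscr{B} \Leftrightarrow \neg\mathscr{C}$
    $\neg\mathscr{B} \Leftrightarrow \neg\mathscr{C} \vdash \mathscr{B} \Leftrightarrow \mathscr{C}$


Proof by contradiction: If a proof of $\Gamma, \neg\mathscr{B} \vdash \mathscr{C} \wedge \neg\mathscr{C}$ involves no application
of Gen using a variable free in $\mathscr{B}$, then $\Gamma \vdash \mathscr{B}$. (Similarly, one obtains $\Gamma \vdash \neg\mathscr{B}$
from $\Gamma, \mathscr{B} \vdash \mathscr{C} \wedge \neg\mathscr{C}$.)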


Exercises 


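2.30 Justify the derived rules listed above.
2.31 Prove the following.
a. $\vdash (\forall x)(\forall y)A_1^2(x, y) \Rightarrow (\forall x)A_1^2(x, x)$
b. $\vdash [(\forall x)\mathscr{B}] \vee [(\forall x)\mathscr{C}] \Rightarrow (\forall x)(\mathscr{B} \vee \mathscr{C})$
c. $\vdash \neg(\exists x)\mathscr{B} \Rightarrow (\forall x)\neg\mathscr{B}$
d. $\vdash (\forall x)\mathscr{B} \Rightarrow (\forall x)(\mathscr{B} \vee \mathscr{C})$
e. $\vdash (\forall x)(\forall y)(A_1^2(x, y) \Rightarrow \neg A_1^2(y, x)) \Rightarrow (\forall x)\neg A_1^2(x, x)$
f. $\vdash [(\exists x)\mathscr{B} \Rightarrow (\forall x)\mathscr{C}] \Rightarrow (\forall x)(\mathscr{B} \Rightarrow \mathscr{C})$
g. $\vdash (\forall x)(\mathscr{B} \vee \mathscr{C}) \Rightarrow [(\forall x)\mathscr{B}] \vee (\exists x)\mathscr{C}$
h. $\vdash (\forall x)(A_1^2(x, x) \Rightarrow (\exists y)A_1^2(x, y))$
i. $\vdash (\forall x)(\mathscr{B} \Rightarrow \mathscr{C}) \Rightarrow [(\forall x)\neg\mathscr{C} \Rightarrow (\forall x)\neg\mathscr{B}]$
j. $\vdash (\exists y)[A_1^1(y) \Rightarrow (\forall y)A_1^1(y)]$
k. $\vdash [(\forall x)(\forall y)(\mathscr{B}(x, y) \Rightarrow \mathscr{B}(y, x)) \wedge (\forall x)(\forall y)(\forall z)(\mathscr{B}(x, y) \wedge
   \mathscr{B}(y, z) \Rightarrow \mathscr{B}(x, z))] \Rightarrow (\forall x)(\forall y)(\mathscr{B}(x, y) \Rightarrow \mathscr{B}(x, x))$
l. $\vdash (\exists x)A_1^2(x, x) \Rightarrow (\exists x)(\exists y)A_1^2(x, y)$
2.32 Assume that $\mathscr{B}$ and $\mathscr{C}$ are wfs and that x is not free in $\mathscr{B}$. Prove the
following.
a. $\vdash \mathscr{B} \Rightarrow (\forall x)\mathscr{B}$
b. $\vdash (\exists x)\mathscr{B} \Rightarrow \mathscr{B}$
c. $\vdash (\mathscr{B} \Rightarrow (\forall x)\mathscr{C}) \Leftrightarrow (\forall x)(\mathscr{B} \Rightarrow \mathscr{C})$
d. $\vdash ((\exists x)\mathscr{C} \Rightarrow \mathscr{B}) \Leftrightarrow (\forall x)(\mathscr{C} \Rightarrow \mathscr{B})$

We need a derived rule that will allow us to replace a part $\mathscr{C}$ of a wf $\mathscr{B}$
by a wf that is provably equivalent to $\mathscr{C}$. For this purpose, we first must
prove the following auxiliary result.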


Lemma 2.8 


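For any wfs $\mathscr{B}$ and $\mathscr{C}$, $\vdash (\forall x)(\mathscr{B} \Leftrightarrow \mathscr{C}) \Rightarrow ((\forall x)\mathscr{B} \Leftrightarrow (\forall x)\mathscr{C})$.


Proof

1. $(\forall x)(\mathscr{B} \Leftrightarrow \mathscr{C})$                                  Hyp
2. $(\forall x)\mathscr{B}$                                                               Hyp
3. $\mathscr{B} \Leftrightarrow \mathscr{C}$                                              1, rule A4
4. $\mathscr{B}$                                                                          2, rule A4
5. $\mathscr{C}$                                                                          3, 4, biconditional elimination
6. $(\forall x)\mathscr{C}$                                                               5, Gen
7. $(\forall x)(\mathscr{B} \Leftrightarrow \mathscr{C}), (\forall x)\mathscr{B} \vdash (\forall x)\mathscr{C}$   1-6
8. $(\forall x)(\mathscr{B} \Leftrightarrow \mathscr{C}) \vdash (\forall x)\mathscr{B} \Rightarrow (\forall x)\mathscr{C}$   1-7, Corollary 2.6
9. $(\forall x)(\mathscr{B} \Leftrightarrow \mathscr{C}) \vdash (\forall x)\mathscr{C} \Rightarrow (\forall x)\mathscr{B}$   Proof like that of 8
10. $(\forall x)(\mathscr{B} \Leftrightarrow \mathscr{C}) \vdash (\forall x)\mathscr{B} \Leftrightarrow (\forall x)\mathscr{C}$   8, 9, biconditional introduction
11. $\vdash (\forall x)(\mathscr{B} \Leftrightarrow \mathscr{C}) \Rightarrow ((\forall x)\mathscr{B} \Leftrightarrow (\forall x)\mathscr{C})$   1-10, Corollary 2.6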


Proposition 2.9 


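If $\mathscr{C}$ is a subformula of $\mathscr{B}$, $\mathscr{B}'$ is the result of replacing zero or more
occurrences of $\mathscr{C}$ in $\mathscr{B}$ by a wf $\mathscr{D}$, and every free variable of $\mathscr{C}$ or $\mathscr{D}$ that is also a
bound variable of $\mathscr{B}$ occurs in the list $y_1, \ldots, y_k$, then:


a. $\vdash [(\forall y_1) \ldots (\forall y_k)(\mathscr{C} \Leftrightarrow \mathscr{D})] \Rightarrow (\mathscr{B} \Leftrightarrow \mathscr{B}')$ (Equivalence theorem)
b. If $\vdash \mathscr{C} \Leftrightarrow \mathscr{D}$, then $\vdash \mathscr{B} \Leftrightarrow \mathscr{B}'$ (Replacement theorem)
c. If $\vdash \mathscr{C} \Leftrightarrow \mathscr{D}$ and $\vdash \mathscr{B}$, then $\vdash \mathscr{B}'$

Example
a. $\vdash (\forall x)(A_1^1(x) \Leftrightarrow A_2^1(x)) \Rightarrow [(\exists x)A_1^1(x) \Leftrightarrow (\exists x)A_2^1(x)]$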


Proof 


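a. We use induction on the number of connectives and quantifiers in $\mathscr{B}$.
Note that, if zero occurrences are replaced, $\mathscr{B}'$ is $\mathscr{B}$ and the wf to be
proved is an instance of the tautology $A \Rightarrow (B \Leftrightarrow B)$. Note also that, if $\mathscr{C}$
is identical with $\mathscr{B}$ and this occurrence of $\mathscr{C}$ is replaced by $\mathscr{D}$, the wf to
be proved, $[(\forall y_1) \ldots (\forall y_k)(\mathscr{C} \Leftrightarrow \mathscr{D})] \Rightarrow (\mathscr{B} \Leftrightarrow \mathscr{B}')$, is derivable by Exercise
2.27(d). Thus, we may assume that $\mathscr{C}$ is a proper part of $\mathscr{B}$ and that at
least one occurrence of $\mathscr{C}$ is replaced. Our inductive hypothesis is that the
result holds for all wfs with fewer connectives and quantifiers than $\mathscr{B}$.

Case 1. $\mathscr{B}$ is an atomic wf. Then $\mathscr{C}$ cannot be a proper part of $\mathscr{B}$.

Case 2. $\mathscr{B}$ is $\neg\mathscr{E}$. Let $\mathscr{B}'$ be $\neg\mathscr{E}'$. By inductive hypothesis,
$\vdash [(\forall y_1) \ldots (\forall y_k)(\mathscr{C} \Leftrightarrow \mathscr{D})] \Rightarrow (\mathscr{E} \Leftrightarrow \mathscr{E}')$. Hence, by a suitable instance of the tautology
$(C \Rightarrow (A \Leftrightarrow B)) \Rightarrow (C \Rightarrow (\neg A \Leftrightarrow \neg B))$ and MP, we obtain
$\vdash [(\forall y_1) \ldots (\forall y_k)(\mathscr{C} \Leftrightarrow \mathscr{D})] \Rightarrow (\mathscr{B} \Leftrightarrow \mathscr{B}')$.

Case 3. $\mathscr{B}$ is $\mathscr{E} \Rightarrow \mathscr{F}$. Let $\mathscr{B}'$ be $\mathscr{E}' \Rightarrow \mathscr{F}'$. By inductive hypothesis,
$\vdash [(\forall y_1) \ldots (\forall y_k)(\mathscr{C} \Leftrightarrow \mathscr{D})] \Rightarrow (\mathscr{E} \Leftrightarrow \mathscr{E}')$ and
$\vdash [(\forall y_1) \ldots (\forall y_k)(\mathscr{C} \Leftrightarrow \mathscr{D})] \Rightarrow (\mathscr{F} \Leftrightarrow \mathscr{F}')$. Using a
suitable instance of the tautology


$(A \Rightarrow (B \Leftrightarrow C)) \wedge (A \Rightarrow (D \Leftrightarrow E)) \Rightarrow (A \Rightarrow [(B \Rightarrow D) \Leftrightarrow (C \Rightarrow E)])$


we obtain $\vdash [(\forall y_1) \ldots (\forall y_k)(\mathscr{C} \Leftrightarrow \mathscr{D})] \Rightarrow (\mathscr{B} \Leftrightarrow \mathscr{B}')$.

Case 4. $\mathscr{B}$ is $(\forall x)\mathscr{E}$. Let $\mathscr{B}'$ be $(\forall x)\mathscr{E}'$. By inductive hypothesis,
$\vdash [(\forall y_1) \ldots (\forall y_k)(\mathscr{C} \Leftrightarrow \mathscr{D})] \Rightarrow (\mathscr{E} \Leftrightarrow \mathscr{E}')$. Now, x does not occur free in
$(\forall y_1) \ldots (\forall y_k)(\mathscr{C} \Leftrightarrow \mathscr{D})$ because, if it did, it would be free in $\mathscr{C}$ or $\mathscr{D}$ and, since it is
bound in $\mathscr{B}$, it would be one of $y_1, \ldots, y_k$ and it would not be free in
$(\forall y_1) \ldots (\forall y_k)(\mathscr{C} \Leftrightarrow \mathscr{D})$. Hence, using axiom (A5), we obtain
$\vdash (\forall y_1) \ldots (\forall y_k)(\mathscr{C} \Leftrightarrow \mathscr{D}) \Rightarrow (\forall x)(\mathscr{E} \Leftrightarrow \mathscr{E}')$. However, by Lemma 2.8,
$\vdash (\forall x)(\mathscr{E} \Leftrightarrow \mathscr{E}') \Rightarrow ((\forall x)\mathscr{E} \Leftrightarrow (\forall x)\mathscr{E}')$. Then, by a suitable tautology and MP,
$\vdash [(\forall y_1) \ldots (\forall y_k)(\mathscr{C} \Leftrightarrow \mathscr{D})] \Rightarrow (\mathscr{B} \Leftrightarrow \mathscr{B}')$.

b. From $\vdash \mathscr{C} \Leftrightarrow \mathscr{D}$, by several applications of Gen, we obtain
$\vdash (\forall y_1) \ldots (\forall y_k)(\mathscr{C} \Leftrightarrow \mathscr{D})$. Then, by (a) and MP, $\vdash \mathscr{B} \Leftrightarrow \mathscr{B}'$.

c. Use part (b) and biconditional elimination.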
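
The proof of part (a) appeals to three propositional tautologies. As an illustrative sketch (the helper names `implies`, `iff`, and `is_tautology` are assumptions introduced here), each can be confirmed by an exhaustive truth-table check:

```python
from itertools import product

def implies(p, q):
    return (not p) or q

def iff(p, q):
    return p == q

def is_tautology(formula, arity):
    """True iff formula holds under every assignment of truth values."""
    return all(formula(*vals) for vals in product([False, True], repeat=arity))

# A => (B <=> B), used when zero occurrences are replaced
taut_zero = lambda a, b: implies(a, iff(b, b))

# (C => (A <=> B)) => (C => (not A <=> not B)), used in Case 2
taut_case2 = lambda a, b, c: implies(
    implies(c, iff(a, b)),
    implies(c, iff(not a, not b)))

# (A => (B <=> C)) and (A => (D <=> E)) => (A => [(B => D) <=> (C => E)]), Case 3
taut_case3 = lambda a, b, c, d, e: implies(
    implies(a, iff(b, c)) and implies(a, iff(d, e)),
    implies(a, iff(implies(b, d), implies(c, e))))

print(is_tautology(taut_zero, 2),
      is_tautology(taut_case2, 3),
      is_tautology(taut_case3, 5))
```

By Proposition 2.1, any instance of a formula that passes this check is a theorem of K, which is all the proof requires of these three schemas.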


Exercises 


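2.33 Prove the following:
a. $\vdash (\exists x)\neg\mathscr{B} \Leftrightarrow \neg(\forall x)\mathscr{B}$
b. $\vdash (\forall x)\mathscr{B} \Leftrightarrow \neg(\exists x)\neg\mathscr{B}$
c. $\vdash (\exists x)(\mathscr{B} \Rightarrow \neg(\mathscr{C} \vee \mathscr{D})) \Rightarrow (\exists x)(\mathscr{B} \Rightarrow \neg\mathscr{C} \wedge \neg\mathscr{D})$
d. $\vdash (\forall x)(\exists y)(\mathscr{B} \Rightarrow \mathscr{C}) \Leftrightarrow (\forall x)(\exists y)(\neg\mathscr{B} \vee \mathscr{C})$
e. $\vdash (\forall x)(\mathscr{B} \Rightarrow \neg\mathscr{C}) \Leftrightarrow \neg(\exists x)(\mathscr{B} \wedge \mathscr{C})$

2.34 Show by a counterexample that we cannot omit the quantifiers
$(\forall y_1) \ldots (\forall y_k)$ in Proposition 2.9(a).

2.35 If $\mathscr{C}$ is obtained from $\mathscr{B}$ by erasing all quantifiers $(\forall x)$ or $(\exists x)$ whose
scope does not contain x free, prove that $\vdash \mathscr{B} \Leftrightarrow \mathscr{C}$.

2.36 For each wf $\mathscr{B}$ below, find a wf $\mathscr{C}$ such that $\vdash \mathscr{C} \Leftrightarrow \neg\mathscr{B}$ and negation signs
in $\mathscr{C}$ apply only to atomic wfs.
a. $(\forall x)(\forall y)(\exists z)A_1^3(x, y, z)$
b. $(\forall \varepsilon)(\varepsilon > 0 \Rightarrow (\exists \delta)(\delta > 0 \wedge (\forall x)(|x - c| < \delta \Rightarrow |f(x) - f(c)| < \varepsilon)))$
c. $(\forall \varepsilon)(\varepsilon > 0 \Rightarrow (\exists n)(\forall m)(m > n \Rightarrow |a_m - b| < \varepsilon))$

2.37 Let $\mathscr{B}$ be a wf that does not contain $\Rightarrow$ and $\Leftrightarrow$. Exchange universal and
existential quantifiers and exchange $\wedge$ and $\vee$. The result $\mathscr{B}^*$ is called the
dual of $\mathscr{B}$.

a. In any predicate calculus, prove the following.
i. $\vdash \mathscr{B}$ if and only if $\vdash \neg\mathscr{B}^*$
ii. $\vdash \mathscr{B} \Rightarrow \mathscr{C}$ if and only if $\vdash \mathscr{C}^* \Rightarrow \mathscr{B}^*$
iii. $\vdash \mathscr{B} \Leftrightarrow \mathscr{C}$ if and only if $\vdash \mathscr{B}^* \Leftrightarrow \mathscr{C}^*$
iv. $\vdash (\exists x)(\mathscr{B} \vee \mathscr{C}) \Leftrightarrow [(\exists x)\mathscr{B} \vee (\exists x)\mathscr{C}]$. [Hint: Use Exercise 2.27(c).]
b. Show that the duality results of part (a), (i)-(iii), do not hold for
arbitrary theories.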


2.6 Rule C 


It is very common in mathematics to reason in the following way. Assume
that we have proved a wf of the form $(\exists x)\mathscr{B}(x)$. Then we say, let b be an object
such that $\mathscr{B}(b)$. We continue the proof, finally arriving at a formula that does
not involve the arbitrarily chosen element b.

For example, let us say that we wish to show that
$(\exists x)(\mathscr{B}(x) \Rightarrow \mathscr{C}(x)), (\forall x)\mathscr{B}(x) \vdash (\exists x)\mathscr{C}(x)$.


1. $(\exists x)(\mathscr{B}(x) \Rightarrow \mathscr{C}(x))$      Hyp
2. $(\forall x)\mathscr{B}(x)$                                   Hyp
3. $\mathscr{B}(b) \Rightarrow \mathscr{C}(b)$ for some b       1
4. $\mathscr{B}(b)$                                              2, rule A4
5. $\mathscr{C}(b)$                                              3, 4, MP
6. $(\exists x)\mathscr{C}(x)$                                   5, rule E4
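Such a proof seems to be perfectly legitimate on an intuitive basis. In fact,
we can achieve the same result without making an arbitrary choice of an
element b as in step 3. This can be done as follows:


1. $(\forall x)\mathscr{B}(x)$                                                   Hyp
2. $(\forall x)\neg\mathscr{C}(x)$                                               Hyp
3. $\mathscr{B}(x)$                                                              1, rule A4
4. $\neg\mathscr{C}(x)$                                                          2, rule A4
5. $\neg(\mathscr{B}(x) \Rightarrow \mathscr{C}(x))$                             3, 4, conditional introduction
6. $(\forall x)\neg(\mathscr{B}(x) \Rightarrow \mathscr{C}(x))$                  5, Gen
7. $(\forall x)\mathscr{B}(x), (\forall x)\neg\mathscr{C}(x) \vdash (\forall x)\neg(\mathscr{B}(x) \Rightarrow \mathscr{C}(x))$   1-6
8. $(\forall x)\mathscr{B}(x) \vdash (\forall x)\neg\mathscr{C}(x) \Rightarrow (\forall x)\neg(\mathscr{B}(x) \Rightarrow \mathscr{C}(x))$   1-7, Corollary 2.6
9. $(\forall x)\mathscr{B}(x) \vdash \neg(\forall x)\neg(\mathscr{B}(x) \Rightarrow \mathscr{C}(x)) \Rightarrow \neg(\forall x)\neg\mathscr{C}(x)$   8, contrapositive
10. $(\forall x)\mathscr{B}(x) \vdash (\exists x)(\mathscr{B}(x) \Rightarrow \mathscr{C}(x)) \Rightarrow (\exists x)\mathscr{C}(x)$   Abbreviation of 9
11. $(\exists x)(\mathscr{B}(x) \Rightarrow \mathscr{C}(x)), (\forall x)\mathscr{B}(x) \vdash (\exists x)\mathscr{C}(x)$   10, MP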


In general, any wf that can be proved using a finite number of arbitrary
choices can also be proved without such acts of choice. We shall call the rule
that permits us to go from $(\exists x)\mathscr{B}(x)$ to $\mathscr{B}(b)$, rule C (“C” for “choice”). More
precisely, a rule C deduction in a first-order theory K is defined in the following
manner: $\Gamma \vdash_C \mathscr{B}$ if and only if there is a sequence of wfs $\mathscr{D}_1, \ldots, \mathscr{D}_n$ such that
$\mathscr{D}_n$ is $\mathscr{B}$ and the following four conditions hold:


1. For each $i \le n$, either
a. $\mathscr{D}_i$ is an axiom of K, or
b. $\mathscr{D}_i$ is in $\Gamma$, or
c. $\mathscr{D}_i$ follows by MP or Gen from preceding wfs in the sequence, or
d. there is a preceding wf $(\exists x)\mathscr{C}(x)$ such that $\mathscr{D}_i$ is $\mathscr{C}(d)$, where d is a
new individual constant (rule C).

2. As axioms in condition 1(a), we also can use all logical axioms that
involve the new individual constants already introduced in the
sequence by applications of rule C.

3. No application of Gen is made using a variable that is free in some
$(\exists x)\mathscr{C}(x)$ to which rule C has been previously applied.

4. $\mathscr{B}$ contains none of the new individual constants introduced in the
sequence in any application of rule C.


A word should be said about the reason for including condition 3. If an
application of rule C to a wf $(\exists x)\mathscr{C}(x)$ yields $\mathscr{C}(d)$, then the object referred to by d
may depend on the values of the free variables in $(\exists x)\mathscr{C}(x)$. Hence, a single object
may not satisfy $\mathscr{C}(x)$ for all values of the free variables in $(\exists x)\mathscr{C}(x)$. For example,
without clause 3, we could proceed as follows:


1. $(\forall x)(\exists y)A_1^2(x, y)$      Hyp
2. $(\exists y)A_1^2(x, y)$                 1, rule A4
3. $A_1^2(x, d)$                            2, rule C
4. $(\forall x)A_1^2(x, d)$                 3, Gen
5. $(\exists y)(\forall x)A_1^2(x, y)$      4, rule E4


However, there is an interpretation for which $(\forall x)(\exists y)A_1^2(x, y)$ is true but
$(\exists y)(\forall x)A_1^2(x, y)$ is false. Take the domain to be the set of integers and let
$A_1^2(x, y)$ mean that $x < y$.
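
The same failure of quantifier exchange can be seen on a finite domain, where it can be checked exhaustively. The sketch below is an assumption-laden stand-in: instead of the text's relation $x < y$ on the integers (which cannot be enumerated), it uses a hypothetical "cyclic successor" relation on {0, 1, 2}.

```python
domain = range(3)

def r(x, y):
    # Hypothetical finite stand-in for the text's "x < y" on the integers:
    # y is the cyclic successor of x.
    return y == (x + 1) % 3

# (forall x)(exists y) R(x, y): every element has a successor.
forall_exists = all(any(r(x, y) for y in domain) for x in domain)

# (exists y)(forall x) R(x, y): a single y that succeeds every x.
exists_forall = any(all(r(x, y) for x in domain) for y in domain)

print(forall_exists, exists_forall)
```

The witness y chosen for each x varies with x, which is exactly why condition 3 forbids generalizing on a variable that is free in the wf to which rule C was applied.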


Proposition 2.10 


If $\Gamma \vdash_C \mathscr{B}$, then $\Gamma \vdash \mathscr{B}$. Moreover, from the following proof it is easy to verify
that, if there is an application of Gen in the new proof of $\mathscr{B}$ from $\Gamma$ using a
certain variable and applied to a wf depending upon a certain wf of $\Gamma$, then
there was such an application of Gen in the original proof.*


Proof 


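Let $(\exists y_1)\mathscr{C}_1(y_1), \ldots, (\exists y_k)\mathscr{C}_k(y_k)$ be the wfs in order of occurrence to which rule C
is applied in the proof of $\Gamma \vdash_C \mathscr{B}$, and let $d_1, \ldots, d_k$ be the corresponding new
individual constants. Then $\Gamma, \mathscr{C}_1(d_1), \ldots, \mathscr{C}_k(d_k) \vdash \mathscr{B}$. Now, by condition 3 of the
definition above, Corollary 2.6 is applicable, yielding
$\Gamma, \mathscr{C}_1(d_1), \ldots, \mathscr{C}_{k-1}(d_{k-1}) \vdash \mathscr{C}_k(d_k) \Rightarrow \mathscr{B}$. We replace $d_k$ everywhere by a variable z
that does not occur in the proof.

Then


$\Gamma, \mathscr{C}_1(d_1), \ldots, \mathscr{C}_{k-1}(d_{k-1}) \vdash \mathscr{C}_k(z) \Rightarrow \mathscr{B}$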


* The first formulation of a version of rule C similar to that given here seems to be due to Rosser 
(1953). 
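and, by Gen,


$\Gamma, \mathscr{C}_1(d_1), \ldots, \mathscr{C}_{k-1}(d_{k-1}) \vdash (\forall z)(\mathscr{C}_k(z) \Rightarrow \mathscr{B})$


Hence, by Exercise 2.32(d),


$\Gamma, \mathscr{C}_1(d_1), \ldots, \mathscr{C}_{k-1}(d_{k-1}) \vdash (\exists y_k)\mathscr{C}_k(y_k) \Rightarrow \mathscr{B}$


But,


$\Gamma, \mathscr{C}_1(d_1), \ldots, \mathscr{C}_{k-1}(d_{k-1}) \vdash (\exists y_k)\mathscr{C}_k(y_k)$

Hence, by MP,

$\Gamma, \mathscr{C}_1(d_1), \ldots, \mathscr{C}_{k-1}(d_{k-1}) \vdash \mathscr{B}$


Repeating this argument, we can eliminate $\mathscr{C}_{k-1}(d_{k-1}), \ldots, \mathscr{C}_1(d_1)$ one after the
other, finally obtaining $\Gamma \vdash \mathscr{B}$.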


Example 
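$\vdash (\forall x)(\mathscr{B}(x) \Rightarrow \mathscr{C}(x)) \Rightarrow ((\exists x)\mathscr{B}(x) \Rightarrow (\exists x)\mathscr{C}(x))$
1. $(\forall x)(\mathscr{B}(x) \Rightarrow \mathscr{C}(x))$          Hyp
2. $(\exists x)\mathscr{B}(x)$                                       Hyp
3. $\mathscr{B}(d)$                                                  2, rule C
4. $\mathscr{B}(d) \Rightarrow \mathscr{C}(d)$                      1, rule A4
5. $\mathscr{C}(d)$                                                  3, 4, MP
6. $(\exists x)\mathscr{C}(x)$                                       5, rule E4
7. $(\forall x)(\mathscr{B}(x) \Rightarrow \mathscr{C}(x)), (\exists x)\mathscr{B}(x) \vdash_C (\exists x)\mathscr{C}(x)$   1-6
8. $(\forall x)(\mathscr{B}(x) \Rightarrow \mathscr{C}(x)), (\exists x)\mathscr{B}(x) \vdash (\exists x)\mathscr{C}(x)$   7, Proposition 2.10
9. $(\forall x)(\mathscr{B}(x) \Rightarrow \mathscr{C}(x)) \vdash (\exists x)\mathscr{B}(x) \Rightarrow (\exists x)\mathscr{C}(x)$   1-8, Corollary 2.6
10. $\vdash (\forall x)(\mathscr{B}(x) \Rightarrow \mathscr{C}(x)) \Rightarrow ((\exists x)\mathscr{B}(x) \Rightarrow (\exists x)\mathscr{C}(x))$   1-9, Corollary 2.6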


Exercises 


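Use rule C and Proposition 2.10 to prove Exercises 2.38-2.45.


2.38 $\vdash (\exists x)(\mathscr{B}(x) \Rightarrow \mathscr{C}(x)) \Rightarrow ((\forall x)\mathscr{B}(x) \Rightarrow (\exists x)\mathscr{C}(x))$

2.39 $\vdash \neg(\exists y)(\forall x)(A_1^2(x, y) \Leftrightarrow \neg A_1^2(x, x))$

2.40 $\vdash [(\forall x)(A_1^1(x) \Rightarrow A_2^1(x) \vee A_3^1(x)) \wedge \neg(\forall x)(A_1^1(x) \Rightarrow A_2^1(x))] \Rightarrow (\exists x)(A_1^1(x) \wedge A_3^1(x))$

2.41 $\vdash [(\exists x)\mathscr{B}(x)] \wedge [(\forall x)\mathscr{C}(x)] \Rightarrow (\exists x)(\mathscr{B}(x) \wedge \mathscr{C}(x))$

2.42 $\vdash (\exists x)\mathscr{C}(x) \Rightarrow (\exists x)(\mathscr{B}(x) \vee \mathscr{C}(x))$

2.43 $\vdash (\exists x)(\exists y)\mathscr{B}(x, y) \Leftrightarrow (\exists y)(\exists x)\mathscr{B}(x, y)$

2.44 $\vdash (\exists x)(\forall y)\mathscr{B}(x, y) \Rightarrow (\forall y)(\exists x)\mathscr{B}(x, y)$

2.45 $\vdash (\exists x)(\mathscr{B}(x) \wedge \mathscr{C}(x)) \Rightarrow ((\exists x)\mathscr{B}(x)) \wedge (\exists x)\mathscr{C}(x)$

2.46 What is wrong with the following alleged derivations?


a. 1. $(\exists x)\mathscr{B}(x)$                             Hyp
   2. $\mathscr{B}(d)$                                        1, rule C
   3. $(\exists x)\mathscr{C}(x)$                             Hyp
   4. $\mathscr{C}(d)$                                        3, rule C
   5. $\mathscr{B}(d) \wedge \mathscr{C}(d)$                  2, 4, conjunction introduction
   6. $(\exists x)(\mathscr{B}(x) \wedge \mathscr{C}(x))$     5, rule E4
   7. $(\exists x)\mathscr{B}(x), (\exists x)\mathscr{C}(x) \vdash (\exists x)(\mathscr{B}(x) \wedge \mathscr{C}(x))$   1-6, Proposition 2.10
b. 1. $(\exists x)(\mathscr{B}(x) \Rightarrow \mathscr{C}(x))$   Hyp
   2. $(\exists x)\mathscr{B}(x)$                             Hyp
   3. $\mathscr{B}(d) \Rightarrow \mathscr{C}(d)$             1, rule C
   4. $\mathscr{B}(d)$                                        2, rule C
   5. $\mathscr{C}(d)$                                        3, 4, MP
   6. $(\exists x)\mathscr{C}(x)$                             5, rule E4
   7. $(\exists x)(\mathscr{B}(x) \Rightarrow \mathscr{C}(x)), (\exists x)\mathscr{B}(x) \vdash (\exists x)\mathscr{C}(x)$   1-6, Proposition 2.10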


2.7 Completeness Theorems 


We intend to show that the theorems of a first-order predicate calculus K are 
precisely the same as the logically valid wfs of K. Half of this result was proved 
in Proposition 2.2. The other half will follow from a much more general prop- 
osition established later. First we must prove a few preliminary lemmas. 

If $x_i$ and $x_j$ are distinct, then $\mathscr{B}(x_i)$ and $\mathscr{B}(x_j)$ are said to be similar if and only
if $x_j$ is free for $x_i$ in $\mathscr{B}(x_i)$ and $\mathscr{B}(x_i)$ has no free occurrences of $x_j$. It is assumed
here that $\mathscr{B}(x_j)$ arises from $\mathscr{B}(x_i)$ by substituting $x_j$ for all free occurrences
of $x_i$. It is easy to see that, if $\mathscr{B}(x_i)$ and $\mathscr{B}(x_j)$ are similar, then $x_i$ is free for $x_j$ in
$\mathscr{B}(x_j)$ and $\mathscr{B}(x_j)$ has no free occurrences of $x_i$. Thus, if $\mathscr{B}(x_i)$ and $\mathscr{B}(x_j)$ are similar,
then $\mathscr{B}(x_j)$ and $\mathscr{B}(x_i)$ are similar. Intuitively, $\mathscr{B}(x_i)$ and $\mathscr{B}(x_j)$ are similar if
and only if $\mathscr{B}(x_i)$ and $\mathscr{B}(x_j)$ are the same except that $\mathscr{B}(x_i)$ has free occurrences
of $x_i$ in exactly those places where $\mathscr{B}(x_j)$ has free occurrences of $x_j$.


Example 


$(\forall x_3)[A_1^2(x_1, x_3) \vee A_1^1(x_1)]$ and $(\forall x_3)[A_1^2(x_2, x_3) \vee A_1^1(x_2)]$ are similar.


Lemma 2.11 


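If $\mathscr{B}(x_i)$ and $\mathscr{B}(x_j)$ are similar, then $\vdash (\forall x_i)\mathscr{B}(x_i) \Leftrightarrow (\forall x_j)\mathscr{B}(x_j)$.
Proof
$\vdash (\forall x_i)\mathscr{B}(x_i) \Rightarrow \mathscr{B}(x_j)$ by axiom (A4). Then, by Gen, $\vdash (\forall x_j)((\forall x_i)\mathscr{B}(x_i) \Rightarrow \mathscr{B}(x_j))$,
and so, by axiom (A5) and MP, $\vdash (\forall x_i)\mathscr{B}(x_i) \Rightarrow (\forall x_j)\mathscr{B}(x_j)$. Similarly,
$\vdash (\forall x_j)\mathscr{B}(x_j) \Rightarrow (\forall x_i)\mathscr{B}(x_i)$. Hence, by biconditional introduction,
$\vdash (\forall x_i)\mathscr{B}(x_i) \Leftrightarrow (\forall x_j)\mathscr{B}(x_j)$.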


Exercises 
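2.47 If $\mathscr{B}(x_i)$ and $\mathscr{B}(x_j)$ are similar, prove that $\vdash (\exists x_i)\mathscr{B}(x_i) \Leftrightarrow (\exists x_j)\mathscr{B}(x_j)$.
2.48 Change of bound variables. If $\mathscr{B}(x)$ is similar to $\mathscr{B}(y)$, $(\forall x)\mathscr{B}(x)$ is a
subformula of $\mathscr{C}$, and $\mathscr{C}'$ is the result of replacing one or more occurrences of
$(\forall x)\mathscr{B}(x)$ in $\mathscr{C}$ by $(\forall y)\mathscr{B}(y)$, prove that $\vdash \mathscr{C} \Leftrightarrow \mathscr{C}'$.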


Lemma 2.12 


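If a closed wf $\neg\mathscr{B}$ of a theory K is not provable in K, and if $K'$ is the theory
obtained from K by adding $\mathscr{B}$ as a new axiom, then $K'$ is consistent.


Proof


Assume $K'$ inconsistent. Then, for some wf $\mathscr{C}$, $\vdash_{K'} \mathscr{C}$ and $\vdash_{K'} \neg\mathscr{C}$. Now,
$\vdash_{K'} \mathscr{C} \Rightarrow (\neg\mathscr{C} \Rightarrow \neg\mathscr{B})$ by Proposition 2.1. So, by two applications of MP, $\vdash_{K'} \neg\mathscr{B}$. Now,
any use of $\mathscr{B}$ as an axiom in a proof in $K'$ can be regarded as a hypothesis
in a proof in K. Hence, $\mathscr{B} \vdash_K \neg\mathscr{B}$. Since $\mathscr{B}$ is closed, we have $\vdash_K \mathscr{B} \Rightarrow \neg\mathscr{B}$ by
Corollary 2.7. However, by Proposition 2.1, $\vdash_K (\mathscr{B} \Rightarrow \neg\mathscr{B}) \Rightarrow \neg\mathscr{B}$. Therefore, by
MP, $\vdash_K \neg\mathscr{B}$, contradicting our hypothesis.


Corollary


If a closed wf $\mathscr{B}$ of a theory K is not provable in K, and if $K'$ is the theory
obtained from K by adding $\neg\mathscr{B}$ as a new axiom, then $K'$ is consistent.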
Lemma 2.13


The set of expressions of a language $\mathscr{L}$ is denumerable. Hence, the same is
true of the set of terms, the set of wfs and the set of closed wfs.


Proof


First assign a distinct positive integer $g(u)$ to each symbol u as follows:
$g(() = 3$, $g()) = 5$, $g(,) = 7$, $g(\neg) = 9$, $g(\Rightarrow) = 11$, $g(\forall) = 13$, $g(x_k) = 13 + 8k$,
$g(a_k) = 7 + 8k$, $g(f_k^n) = 1 + 8(2^n 3^k)$, and $g(A_k^n) = 3 + 8(2^n 3^k)$. Then, to an expression
$u_0 u_1 \ldots u_r$ associate the number $2^{g(u_0)} 3^{g(u_1)} \cdots p_r^{g(u_r)}$, where $p_j$ is the jth prime number,
starting with $p_0 = 2$. (Example: the number of $A_1^1(x_2)$ is $2^{51} 3^{3} 5^{29} 7^{5}$.) We can
enumerate all expressions in the order of their associated numbers; so, the set of
expressions is denumerable.
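
The numbering in this proof is easy to compute directly. The following is a minimal sketch under an assumed encoding of symbols (plain strings for the six fixed symbols, and tuples such as `("x", k)` or `("A", n, k)` for indexed ones); the encoding and the function names are illustrative, not part of the text.

```python
def primes():
    """Yield 2, 3, 5, ... by trial division (adequate for short expressions)."""
    found = []
    n = 2
    while True:
        if all(n % p for p in found):
            found.append(n)
            yield n
        n += 1

# Assumed string names for the six fixed symbols of the language.
FIXED = {"(": 3, ")": 5, ",": 7, "not": 9, "implies": 11, "forall": 13}

def g(symbol):
    """Symbol number g(u), following the assignment in the proof.

    Assumed encoding: ("x", k) is the variable x_k, ("a", k) the constant a_k,
    ("f", n, k) the function letter f_k^n, ("A", n, k) the predicate letter A_k^n.
    """
    if symbol in FIXED:
        return FIXED[symbol]
    kind = symbol[0]
    if kind == "x":
        return 13 + 8 * symbol[1]
    if kind == "a":
        return 7 + 8 * symbol[1]
    if kind == "f":
        return 1 + 8 * (2 ** symbol[1] * 3 ** symbol[2])
    if kind == "A":
        return 3 + 8 * (2 ** symbol[1] * 3 ** symbol[2])
    raise ValueError(f"unknown symbol: {symbol!r}")

def number_of(expression):
    """Number of u_0 u_1 ... u_r: the product of p_j ** g(u_j), with p_0 = 2."""
    result = 1
    for p, u in zip(primes(), expression):
        result *= p ** g(u)
    return result

# The example from the text: the expression A_1^1(x_2).
example = [("A", 1, 1), "(", ("x", 2), ")"]
assert number_of(example) == 2**51 * 3**3 * 5**29 * 7**5
```

Because prime factorizations are unique, distinct expressions receive distinct numbers, and an expression can be recovered from its number by factoring.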

If we can effectively tell whether any given symbol is a symbol of $\mathscr{L}$, then this
enumeration can be effectively carried out, and, in addition, we can effectively
decide whether any given number is the number of an expression of $\mathscr{L}$. The
same holds true for terms, wfs and closed wfs. If a theory K in the language $\mathscr{L}$ is
axiomatic, that is, if we can effectively decide whether any given wf is an axiom
of K, then we can effectively enumerate the theorems of K in the following manner.
Starting with a list consisting of the first axiom of K in the enumeration just
specified, add to the list all the direct consequences of this axiom by MP and by
Gen used only once and with $x_1$ as quantified variable. Add the second axiom to
this new list and write all new direct consequences by MP and Gen of the wfs in
this augmented list, with Gen used only once and with $x_1$ and $x_2$ as quantified
variables. If at the kth step we add the kth axiom and apply MP and Gen to the
wfs in the new list (with Gen applied only once for each of the variables $x_1, \ldots, x_k$),
we eventually obtain in this manner all theorems of K. However, in contradistinction
to the case of expressions, terms, wfs and closed wfs, it turns out that
there are axiomatic theories K for which we cannot tell in advance whether any
given wf of K will eventually appear in the list of theorems.


Definitions 
i. A theory K is said to be complete if, for every closed wf $\mathscr{B}$ of K, either
$\vdash_K \mathscr{B}$ or $\vdash_K \neg\mathscr{B}$.


ii. A theory K’ is said to be an extension of a theory K if every theorem of K 
is a theorem of K’. (We also say in such a case that K is a subtheory of K’) 


Proposition 2.14 (Lindenbaum’s Lemma) 


If K is a consistent theory, then there is a consistent, complete extension of K. 
Proof


Let $\mathscr{B}_1, \mathscr{B}_2, \ldots$ be an enumeration of all closed wfs of the language of K, by
Lemma 2.13. Define a sequence $J_0, J_1, J_2, \ldots$ of theories in the following way.
$J_0$ is K. Assume $J_n$ is defined, with $n \ge 0$. If it is not the case that $\vdash_{J_n} \neg\mathscr{B}_{n+1}$, then
let $J_{n+1}$ be obtained from $J_n$ by adding $\mathscr{B}_{n+1}$ as an additional axiom. On the other
hand, if $\vdash_{J_n} \neg\mathscr{B}_{n+1}$, let $J_{n+1} = J_n$. Let J be the theory obtained by taking as axioms
all the axioms of all the $J_i$s. Clearly, $J_{i+1}$ is an extension of $J_i$, and J is an
extension of all the $J_i$s, including $J_0 = K$. To show that J is consistent, it suffices to
prove that every $J_i$ is consistent because a proof of a contradiction in J,
involving as it does only a finite number of axioms, is also a proof of a contradiction
in some $J_i$. We prove the consistency of the $J_i$s by induction. By hypothesis,
$J_0 = K$ is consistent. Assume that $J_i$ is consistent. If $J_{i+1} = J_i$, then $J_{i+1}$ is consistent.
If $J_i \ne J_{i+1}$, and therefore, by the definition of $J_{i+1}$, $\neg\mathscr{B}_{i+1}$ is not provable in $J_i$, then,
by Lemma 2.12, $J_{i+1}$ is also consistent. So, we have proved that all the $J_i$s are
consistent and, therefore, that J is consistent. To prove the completeness of J,
let $\mathscr{C}$ be any closed wf of K. Then $\mathscr{C} = \mathscr{B}_{j+1}$ for some $j \ge 0$. Now, either $\vdash_{J_j} \neg\mathscr{B}_{j+1}$ or
$\vdash_{J_{j+1}} \mathscr{B}_{j+1}$, since, if it is not the case that $\vdash_{J_j} \neg\mathscr{B}_{j+1}$, then $\mathscr{B}_{j+1}$ is added as an axiom
in $J_{j+1}$. Therefore, either $\vdash_J \neg\mathscr{B}_{j+1}$ or $\vdash_J \mathscr{B}_{j+1}$. Thus, J is complete.

Note that even if one can effectively determine whether any wf is an axiom
of K, it may not be possible to do the same with (or even to enumerate
effectively) the axioms of J; that is, J may not be axiomatic even if K is. This is due
to the possibility of not being able to determine, at each step, whether or not
$\neg\mathscr{B}_{n+1}$ is provable in $J_n$.


Exercises 
2.49 Show that a theory K is complete if and only if, for any closed wfs $\mathscr{B}$
and $\mathscr{C}$ of K, if $\vdash_K \mathscr{B} \vee \mathscr{C}$, then $\vdash_K \mathscr{B}$ or $\vdash_K \mathscr{C}$.


2.50^D Prove that every consistent decidable theory has a consistent,
decidable, complete extension.


Definitions 


1. A closed term is a term without variables. 
2. A theory K is a scapegoat theory* if, for any wf $\mathscr{B}(x)$ that has x as its
only free variable, there is a closed term t such that


$\vdash_K (\exists x)\neg\mathscr{B}(x) \Rightarrow \neg\mathscr{B}(t)$


* If a scapegoat theory assumes that a given property B fails for at least one object, then there 
must be a name (that is, a suitable closed term t) of a specific object for which B provably fails. 
So, t would play the role of a scapegoat, in the usual meaning of that idea. Many theories lack 
the linguistic resources (individual constants and function letters) to be scapegoat theories, 
but the notion of scapegoat theory will be very useful in proving some deep properties of first- 
order theories. 
Lemma 2.15


Every consistent theory K has a consistent extension K’ such that K’ is a 
scapegoat theory and K’ contains denumerably many closed terms. 


Proof 


Add to the symbols of K a denumerable set $\{b_1, b_2, \ldots\}$ of new individual
constants. Call this new theory $K_0$. Its axioms are those of K plus those logical
axioms that involve the symbols of K and the new constants. $K_0$ is consistent.
For, if not, there is a proof in $K_0$ of a wf $\mathscr{B} \wedge \neg\mathscr{B}$. Replace each $b_i$ appearing in
this proof by a variable that does not appear in the proof. This transforms
axioms into axioms and preserves the correctness of the applications of the
rules of inference. The final wf in the proof is still a contradiction, but now
the proof does not involve any of the $b_i$s and therefore is a proof in K. This
contradicts the consistency of K. Hence, $K_0$ is consistent.

By Lemma 2.13, let $F_1(x_{i_1}), F_2(x_{i_2}), \ldots, F_k(x_{i_k}), \ldots$ be an enumeration of all wfs
of $K_0$ that have one free variable. Choose a sequence $b_{j_1}, b_{j_2}, \ldots$ of some of the
new individual constants such that each $b_{j_k}$ is not contained in any of the
wfs $F_1(x_{i_1}), \ldots, F_k(x_{i_k})$ and such that $b_{j_k}$ is different from each of $b_{j_1}, \ldots, b_{j_{k-1}}$.
Consider the wf


$(S_k)$   $(\exists x_{i_k})\neg F_k(x_{i_k}) \Rightarrow \neg F_k(b_{j_k})$


Let $K_n$ be the theory obtained by adding $(S_1), \ldots, (S_n)$ to the axioms of $K_0$,
and let $K_\infty$ be the theory obtained by adding all the $(S_i)$s as axioms to $K_0$.
Any proof in $K_\infty$ contains only a finite number of the $(S_i)$s and, therefore,
will also be a proof in some $K_n$. Hence, if all the $K_n$s are consistent, so is
$K_\infty$. To demonstrate that all the $K_n$s are consistent, proceed by induction.
We know that $K_0$ is consistent. Assume that $K_{n-1}$ is consistent but that $K_n$
is inconsistent ($n \ge 1$). Then, as we know, any wf is provable in $K_n$ (by the
tautology $\neg A \Rightarrow (A \Rightarrow B)$, Proposition 2.1 and MP). In particular, $\vdash_{K_n} \neg(S_n)$.
Hence, $(S_n) \vdash_{K_{n-1}} \neg(S_n)$. Since $(S_n)$ is closed, we have, by Corollary 2.7,
$\vdash_{K_{n-1}} (S_n) \Rightarrow \neg(S_n)$. But, by the tautology $(A \Rightarrow \neg A) \Rightarrow \neg A$, Proposition 2.1 and
MP, we then have $\vdash_{K_{n-1}} \neg(S_n)$; that is, $\vdash_{K_{n-1}} \neg[(\exists x_{i_n})\neg F_n(x_{i_n}) \Rightarrow \neg F_n(b_{j_n})]$. Now,
by conditional elimination, we obtain $\vdash_{K_{n-1}} (\exists x_{i_n})\neg F_n(x_{i_n})$ and $\vdash_{K_{n-1}} \neg\neg F_n(b_{j_n})$,
and then, by negation elimination, $\vdash_{K_{n-1}} F_n(b_{j_n})$. From the latter and the fact
that $b_{j_n}$ does not occur in $(S_1), \ldots, (S_{n-1})$, we conclude $\vdash_{K_{n-1}} F_n(x_r)$, where $x_r$
is a variable that does not occur in the proof of $F_n(b_{j_n})$. (Simply replace in
the proof all occurrences of $b_{j_n}$ by $x_r$.) By Gen, $\vdash_{K_{n-1}} (\forall x_r)F_n(x_r)$, and then, by
Lemma 2.11 and biconditional elimination, $\vdash_{K_{n-1}} (\forall x_{i_n})F_n(x_{i_n})$. (We use the
fact that $F_n(x_r)$ and $F_n(x_{i_n})$ are similar.) But we already have $\vdash_{K_{n-1}} (\exists x_{i_n})\neg F_n(x_{i_n})$,
which is an abbreviation of $\vdash_{K_{n-1}} \neg(\forall x_{i_n})\neg\neg F_n(x_{i_n})$, whence, by the replacement
theorem, $\vdash_{K_{n-1}} \neg(\forall x_{i_n})F_n(x_{i_n})$, contradicting the hypothesis that $K_{n-1}$ is
consistent. Hence, $K_n$ must also be consistent. Thus $K_\infty$ is consistent, it is
an extension of K, and it is clearly a scapegoat theory.


Lemma 2.16 


Let J be a consistent, complete scapegoat theory. Then J has a model M whose 
domain is the set D of closed terms of J. 


Proof 


For any individual constant a_i of J, let (a_i)^M = a_i. For any function letter f_k^n
of J and for any closed terms t_1, ..., t_n of J, let (f_k^n)^M(t_1, ..., t_n) = f_k^n(t_1, ..., t_n).
(Notice that f_k^n(t_1, ..., t_n) is a closed term. Hence, (f_k^n)^M is an n-ary operation
on D.) For any predicate letter A_k^n of J, let (A_k^n)^M consist of all n-tuples (t_1, ..., t_n)
of closed terms t_1, ..., t_n of J such that ⊢_J A_k^n(t_1, ..., t_n). It now suffices to show
that, for any closed wf 𝒞 of J:


(□) ⊨_M 𝒞 if and only if ⊢_J 𝒞


(If this is established and ℬ is any axiom of J, let 𝒞 be the closure of ℬ. By Gen,
⊢_J 𝒞. By (□), ⊨_M 𝒞. By (VI) on page 58, ⊨_M ℬ. Hence, M would be a model of J.)
The proof of (□) is by induction on the number r of connectives and quantifiers
in 𝒞. Assume that (□) holds for all closed wfs with fewer than r connectives
and quantifiers.


Case 1. 𝒞 is a closed atomic wf A_k^n(t_1, ..., t_n). Then (□) is a direct consequence
of the definition of (A_k^n)^M.


Case 2. 𝒞 is ¬𝒟. If 𝒞 is true for M, then 𝒟 is false for M and so, by inductive
hypothesis, not-⊢_J 𝒟. Since J is complete and 𝒟 is closed, ⊢_J ¬𝒟, that is, ⊢_J 𝒞.
Conversely, if 𝒞 is not true for M, then 𝒟 is true for M. Hence, ⊢_J 𝒟. Since J is
consistent, not-⊢_J ¬𝒟, that is, not-⊢_J 𝒞.


Case 3. 𝒞 is 𝒟 ⇒ ℰ. Since 𝒞 is closed, so are 𝒟 and ℰ. If 𝒞 is false for M, then 𝒟
is true and ℰ is false. Hence, by inductive hypothesis, ⊢_J 𝒟 and not-⊢_J ℰ. By
the completeness of J, ⊢_J ¬ℰ. Therefore, by an instance of the tautology D ⇒
(¬E ⇒ ¬(D ⇒ E)) and two applications of MP, ⊢_J ¬(𝒟 ⇒ ℰ), that is, ⊢_J ¬𝒞, and
so, by the consistency of J, not-⊢_J 𝒞. Conversely, if not-⊢_J 𝒞, then, by the
completeness of J, ⊢_J ¬𝒞, that is, ⊢_J ¬(𝒟 ⇒ ℰ). By conditional elimination, ⊢_J 𝒟 and
⊢_J ¬ℰ. Hence, by (□) for 𝒟, 𝒟 is true for M. By the consistency of J, not-⊢_J ℰ
and, therefore, by (□) for ℰ, ℰ is false for M. Thus, since 𝒟 is true for M and ℰ
is false for M, 𝒞 is false for M.


Case 4. 𝒞 is (∀x_m)𝒟.


Case 4a. 𝒟 is a closed wf. By inductive hypothesis, ⊨_M 𝒟 if and only if
⊢_J 𝒟. By Exercise 2.32(a), ⊢_J 𝒟 ⇔ (∀x_m)𝒟. So, ⊢_J 𝒟 if and only if ⊢_J (∀x_m)𝒟, by
biconditional elimination. Moreover, ⊨_M 𝒟 if and only if ⊨_M (∀x_m)𝒟 by
property (VI) on page 58. Hence, ⊨_M 𝒞 if and only if ⊢_J 𝒞.


Case 4b. 𝒟 is not a closed wf. Since 𝒞 is closed, 𝒟 has x_m as its only free
variable, say 𝒟 is F(x_m). Then 𝒞 is (∀x_m)F(x_m).


i. Assume ⊨_M 𝒞 and not-⊢_J 𝒞. By the completeness of J, ⊢_J ¬𝒞, that is,
⊢_J ¬(∀x_m)F(x_m). Then, by Exercise 2.33(a) and biconditional elimination,
⊢_J (∃x_m)¬F(x_m). Since J is a scapegoat theory, ⊢_J ¬F(t) for some
closed term t of J. But ⊨_M 𝒞, that is, ⊨_M (∀x_m)F(x_m). Since (∀x_m)F(x_m) ⇒ F(t)
is true for M by property (X) on page 59, ⊨_M F(t). Hence, by (□) for F(t),
⊢_J F(t). This contradicts the consistency of J. Thus, if ⊨_M 𝒞, then ⊢_J 𝒞.

ii. Assume ⊢_J 𝒞 and not-⊨_M 𝒞. Thus,


(#) ⊢_J (∀x_m)F(x_m) and (##) not-⊨_M (∀x_m)F(x_m).


By (##), some sequence of elements of the domain D does not satisfy
(∀x_m)F(x_m). Hence, some sequence s does not satisfy F(x_m). Let t be the mth
component of s. Notice that s*(u) = u for all closed terms u of J (by the
definition of (a_i)^M and (f_k^n)^M). Observe also that F(t) has fewer connectives and
quantifiers than 𝒞 and, therefore, the inductive hypothesis applies to F(t), that
is, (□) holds for F(t). Hence, by Lemma 2(a) on page 60, s does not satisfy F(t).
So, F(t) is false for M. But, by (#) and rule A4, ⊢_J F(t), and so, by (□) for F(t),
⊨_M F(t). This contradiction shows that, if ⊢_J 𝒞, then ⊨_M 𝒞.

Now we can prove the fundamental theorem of quantification theory. By 
a denumerable model we mean a model in which the domain is denumerable. 


Proposition 2.17* 


Every consistent theory K has a denumerable model. 


Proof 


By Lemma 2.15, K has a consistent extension K’ such that K’ is a scapegoat 
theory and has denumerably many closed terms. By Lindenbaum’s lemma, 
K’ has a consistent, complete extension J that has the same symbols as K’. 
Hence, J is also a scapegoat theory. By Lemma 2.16, J has a model M whose 
domain is the denumerable set of closed terms of J. Since J is an extension of 
K, M is a denumerable model of K.


* The proof given here is essentially due to Henkin (1949), as simplified by Hasenjaeger (1953). 
The result was originally proved by Gödel (1930). Other proofs have been published by
Rasiowa and Sikorski (1951, 1952) and Beth (1951), using (Boolean) algebraic and topologi- 
cal methods, respectively. Still other proofs may be found in Hintikka (1955a,b) and in Beth 
(1959). 
Corollary 2.18


Any logically valid wf of a theory K is a theorem of K. 


Proof 


We need consider only closed wfs ℬ, since a wf 𝒟 is logically valid if and only
if its closure is logically valid, and 𝒟 is provable in K if and only if its closure
is provable in K. So, let ℬ be a logically valid closed wf of K. Assume that
not-⊢_K ℬ. By Lemma 2.12, if we add ¬ℬ as a new axiom to K, the new theory
K’ is consistent. Hence, by Proposition 2.17, K’ has a model M. Since ¬ℬ is an
axiom of K’, ¬ℬ is true for M. But, since ℬ is logically valid, ℬ is true for M.
Hence, ℬ is both true and false for M, which is impossible (by (II) on page 57).
Thus, ℬ must be a theorem of K.


Corollary 2.19 (Gödel’s Completeness Theorem, 1930)


In any predicate calculus, the theorems are precisely the logically valid wfs. 


Proof 


This follows from Proposition 2.2 and Corollary 2.18. (Gödel’s original proof
runs along quite different lines. For other proofs, see Beth (1951), Dreben 
(1952), Hintikka (1955a,b) and Rasiowa and Sikorski (1951, 1952).) 


Corollary 2.20 


Let K be any theory. 


a. A wf ℬ is true in every denumerable model of K if and only if ⊢_K ℬ.

b. If, in every model of K, every sequence that satisfies all wfs in a set Γ
of wfs also satisfies a wf ℬ, then Γ ⊢_K ℬ.

c. If a wf ℬ of K is a logical consequence of a set Γ of wfs of K, then Γ ⊢_K ℬ.

d. If a wf ℬ of K is a logical consequence of a wf 𝒞 of K, then 𝒞 ⊢_K ℬ.


Proof 


a. We may assume ℬ is closed (Why?). If not-⊢_K ℬ, then the theory K’ =
K + {¬ℬ} is consistent, by Lemma 2.12.* Hence, by Proposition 2.17, K’
has a denumerable model M. However, ¬ℬ, being an axiom of K’, is
true for M. By hypothesis, since M is a denumerable model of K, ℬ is
true for M. Therefore, ℬ is true and false for M, which is impossible.


* If K is a theory and Δ is a set of wfs of K, then K + Δ denotes the theory obtained from K by
adding the wfs of Δ as axioms.
b. Consider the theory K + Γ. By the hypothesis, ℬ is true for every
model of this theory. Hence, by (a), ⊢_{K+Γ} ℬ. So, Γ ⊢_K ℬ.


Part (c) is a consequence of (b), and part (d) is a special case of (c). 

Corollaries 2.18–2.20 show that the “syntactical” approach to quantification
theory by means of first-order theories is equivalent to the “semantical”
approach through the notions of interpretations, models, logical validity, 
and so on. For the propositional calculus, Corollary 1.15 demonstrated the 
analogous equivalence between the semantical notion (tautology) and the 
syntactical notion (theorem of L). Notice also that, in the propositional cal- 
culus, the completeness of the system L (see Proposition 1.14) led to a solu- 
tion of the decision problem. However, for first-order theories, we cannot 
obtain a decision procedure for logical validity or, equivalently, for prov- 
ability in first-order predicate calculi. We shall prove this and related results 
in Section 3.6. 


Corollary 2.21 (Skolem–Löwenheim Theorem, 1920, 1915)
Any theory that has a model has a denumerable model. 


Proof 


If K has a model, then K is consistent, since no wf can be both true and 
false for the same model M. Hence, by Proposition 2.17, K has a denumer- 
able model. 

The following stronger consequence of Proposition 2.17 is derivable. 


Corollary 2.22^A


For any cardinal number m ≥ ℵ_0, any consistent theory K has a model of
cardinality m.


Proof 


By Proposition 2.17, we know that K has a denumerable model. Therefore, it 
suffices to prove the following lemma. 


Lemma 


If m and n are two cardinal numbers such that m < n and if K has a model 
of cardinality m, then K has a model of cardinality n. 




Proof 


Let M be a model of K with domain D of cardinality m. Let D’ be a set of
cardinality n that includes D. Extend the model M to an interpretation M’
that has D’ as domain in the following way. Let c be a fixed element of D. We
stipulate that the elements of D’ − D behave like c. For example, if B_j^n is the
interpretation in M of the predicate letter A_j^n and (B_j^n)’ is the new
interpretation in M’, then for any d_1, ..., d_n in D’, (B_j^n)’ holds for (d_1, ..., d_n) if
and only if B_j^n holds for (u_1, ..., u_n), where u_i = d_i if d_i ∈ D and u_i = c if
d_i ∈ D’ − D. The interpretation of the function letters is extended in an analogous
way, and the individual constants have the same interpretations as in M. It is an
easy exercise to show, by induction on the number of connectives and quantifiers
in a wf ℬ, that ℬ is true for M’ if and only if it is true for M. Hence, M’ is a
model of K of cardinality n.
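
The construction in this proof can be tried out on a small finite case. The following Python sketch is only an illustration under stated assumptions: the tuple encoding of wfs, the helper names satisfies and extend, and the sample relation are all hypothetical and not part of the text. It enlarges a two-element interpretation of one binary predicate letter by three new elements that behave like c = 0, and compares the truth values of a few closed wfs before and after.

```python
from itertools import product

# Hypothetical encoding of wfs over one binary predicate letter A:
#   ("A", i, j)  atomic wf A(x_i, x_j);  ("not", f);  ("imp", f, g);  ("all", i, f)
def satisfies(rel, domain, wf, s):
    """True if the assignment s (variable index -> element) satisfies wf."""
    tag = wf[0]
    if tag == "A":
        return (s[wf[1]], s[wf[2]]) in rel
    if tag == "not":
        return not satisfies(rel, domain, wf[1], s)
    if tag == "imp":
        return (not satisfies(rel, domain, wf[1], s)) or satisfies(rel, domain, wf[2], s)
    if tag == "all":
        return all(satisfies(rel, domain, wf[2], {**s, wf[1]: d}) for d in domain)
    raise ValueError(f"unknown tag: {tag}")

def extend(rel, domain, new_elements, c):
    """Build the interpretation on D' in which each element of D' - D behaves like c."""
    big = list(domain) + list(new_elements)

    def u(d):
        return d if d in domain else c

    big_rel = {(d1, d2) for d1, d2 in product(big, repeat=2) if (u(d1), u(d2)) in rel}
    return big_rel, big

D = [0, 1]
R = {(0, 0), (0, 1), (1, 1)}                      # "less than or equal" on {0, 1}
R2, D2 = extend(R, D, ["p", "q", "r"], c=0)

sentences = [
    ("all", 1, ("A", 1, 1)),                                     # reflexivity
    ("all", 1, ("all", 2, ("imp", ("A", 1, 2), ("A", 2, 1)))),   # symmetry
    ("not", ("all", 1, ("not", ("all", 2, ("A", 1, 2))))),       # (Ex1)(Ax2)A(x1, x2)
]
# Each pair is (truth value in M, truth value in M').
results = [(satisfies(R, D, f, {}), satisfies(R2, D2, f, {})) for f in sentences]
print(results)
```

Each pair has equal components, as the lemma predicts for closed wfs; the check covers only these three sample sentences, not the general induction.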


Exercises 


2.51 For any theory K, if Γ ⊢_K ℬ and each wf in Γ is true for a model M of K,
show that ℬ is true for M.


2.52 If a wf ℬ without quantifiers is provable in a predicate calculus, prove
that ℬ is an instance of a tautology and, hence, by Proposition 2.1, has
a proof without quantifiers using only axioms (A1)-(A3) and MP. [Hint:
if ℬ were not a tautology, one could construct an interpretation, having
the set of terms that occur in ℬ as its domain, for which ℬ is not true,
contradicting Proposition 2.2.]


Note that this implies the consistency of the predicate calculus and 
also provides a decision procedure for the provability of wfs without 
quantifiers. 


2.53 Show that ⊢_K ℬ if and only if there is a wf 𝒞 that is the closure of the
conjunction of some axioms of K such that 𝒞 ⇒ ℬ is logically valid.


2.54 Compactness. If all finite subsets of the set of axioms of a theory K have 
models, prove that K has a model. 


2.55 a. For any wf ℬ, prove that there is only a finite number of interpretations
of ℬ on a given domain of finite cardinality k.


b. For any wf ℬ, prove that there is an effective way of determining
whether ℬ is true for all interpretations with domain of some
fixed cardinality k.


c. Let a wf be called k-valid if it is true for all interpretations that
have a domain of k elements. Call ℬ precisely k-valid if it is k-valid
but not (k + 1)-valid. Show that (k + 1)-validity implies k-validity
and give an example of a wf that is precisely k-valid. (See Hilbert
and Bernays (1934, § 4-5) and Wajsberg (1933).)
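
The finiteness and effectiveness claims in Exercise 2.55 can be seen concretely in the simplest case. The sketch below is an illustration only: the sample wf (∃x)A(x) ⇒ (∀x)A(x), with one unary predicate letter A, and the function names are assumptions chosen for this example and are not taken from the text. It enumerates every interpretation of A on a k-element domain and tests k-validity by brute force.

```python
from itertools import product

def interpretations(k):
    """Yield every interpretation of one unary predicate letter A on {0, ..., k-1}."""
    for bits in product([False, True], repeat=k):
        yield dict(enumerate(bits))

def wf_holds(A):
    """Truth value of the closed wf (Ex)A(x) => (Ax)A(x) under the interpretation A."""
    return (not any(A.values())) or all(A.values())

def is_k_valid(k):
    """True if the sample wf is true for all interpretations with k elements."""
    return all(wf_holds(A) for A in interpretations(k))

counts = [sum(1 for _ in interpretations(k)) for k in (1, 2, 3)]
validity = [is_k_valid(k) for k in (1, 2, 3)]
print(counts, validity)
```

There are 2^k interpretations of A on a k-element domain, so the search always terminates. The sample wf is 1-valid but not 2-valid.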
2.56 Show that the following wf is true for all finite domains but is false for
some infinite domain.


(∀x)(∀y)(∀z)[A_1^2(x, x) ∧ (A_1^2(x, y) ∧ A_1^2(y, z) ⇒ A_1^2(x, z)) ∧ (A_1^2(x, y) ∨ A_1^2(y, x))]
⇒ (∃y)(∀x)A_1^2(y, x)


2.57 Prove that there is no theory K whose models are exactly the
interpretations with finite domains.


2.58 Let ℬ be any wf that contains no quantifiers, function letters, or
individual constants.


a. Show that a closed prenex wf (∀x_1) ... (∀x_n)(∃y_1) ... (∃y_m)ℬ, with m ≥ 0
and n ≥ 1, is logically valid if and only if it is true for every
interpretation with a domain of n objects.


b. Prove that a closed prenex wf (∃y_1) ... (∃y_m)ℬ is logically valid if and
only if it is true for all interpretations with a domain of one element.


c. Show that there is an effective procedure to determine the logical
validity of all wfs of the forms given in (a) and (b).


2.59 Let K_1 and K_2 be theories in the same language ℒ. Assume that any
interpretation M of ℒ is a model of K_1 if and only if M is not a model
of K_2. Prove that K_1 and K_2 are finitely axiomatizable, that is, there are
finite sets of sentences Γ and Δ such that, for any sentence ℬ, ⊢_{K_1} ℬ if
and only if Γ ⊢ ℬ, and ⊢_{K_2} ℬ if and only if Δ ⊢ ℬ.*


2.60 A set Γ of sentences is called an independent axiomatization of a theory K
if (a) all sentences in Γ are theorems of K, (b) Γ ⊢ ℬ for every theorem ℬ
of K, and (c) for every sentence 𝒞 of Γ, it is not the case that Γ − {𝒞} ⊢ 𝒞.*
Prove that every theory K has an independent axiomatization.


2.61^A If, for some cardinal m ≥ ℵ_0, a wf ℬ is true for every interpretation of
cardinality m, prove that ℬ is logically valid.


2.62^A If a wf ℬ is true for all interpretations of cardinality m, prove that ℬ is
true for all interpretations of cardinality less than or equal to m.


2.63 a. Prove that a theory K is a scapegoat theory if and only if, for any wf
ℬ(x) with x as its only free variable, there is a closed term t such that
⊢_K (∃x)ℬ(x) ⇒ ℬ(t).

b. Prove that a theory K is a scapegoat theory if and only if, for any wf
ℬ(x) with x as its only free variable such that ⊢_K (∃x)ℬ(x), there is a
closed term t such that ⊢_K ℬ(t).


c. Prove that no predicate calculus is a scapegoat theory.


* Here, an expression Γ ⊢ ℬ, without any subscript attached to ⊢, means that ℬ is derivable
from Γ using only logical axioms, that is, within the predicate calculus.
2.8 First-Order Theories with Equality


Let K be a theory that has as one of its predicate letters A_1^2. Let us write t = s
as an abbreviation for A_1^2(t, s), and t ≠ s as an abbreviation for ¬A_1^2(t, s). Then
K is called a first-order theory with equality (or simply a theory with equality) if
the following are theorems of K:


(A6) (∀x_1)x_1 = x_1 (reflexivity of equality)
(A7) x = y ⇒ (ℬ(x, x) ⇒ ℬ(x, y)) (substitutivity of equality)


where x and y are any variables, ℬ(x, x) is any wf, and ℬ(x, y) arises from
ℬ(x, x) by replacing some, but not necessarily all, free occurrences of x by y,
with the proviso that y is free for x in ℬ(x, x). Thus, ℬ(x, y) may or may not
contain free occurrences of x.

The numbering (A6) and (A7) is a continuation of the numbering of the 
logical axioms. 


Proposition 2.23 


In any theory with equality, 


a. ⊢ t = t for any term t;
b. ⊢ t = s ⇒ s = t for any terms t and s;
c. ⊢ t = s ⇒ (s = r ⇒ t = r) for any terms t, s, and r.


Proof 


a. By (A6), ⊢ (∀x_1)x_1 = x_1. Hence, by rule A4, ⊢ t = t.

b. Let x and y be variables not occurring in t or s. Letting ℬ(x, x) be x = x
and ℬ(x, y) be y = x in schema (A7), ⊢ x = y ⇒ (x = x ⇒ y = x). But,
by (a), ⊢ x = x. So, by an instance of the tautology (A ⇒ (B ⇒ C)) ⇒
(B ⇒ (A ⇒ C)) and two applications of MP, we have ⊢ x = y ⇒ y = x.
Two applications of Gen yield ⊢ (∀x)(∀y)(x = y ⇒ y = x), and then two
applications of rule A4 give ⊢ t = s ⇒ s = t.

c. Let x, y, and z be three variables not occurring in t, s, or r. Letting
ℬ(y, y) be y = z and ℬ(y, x) be x = z in (A7), with x and y interchanged,
we obtain ⊢ y = x ⇒ (y = z ⇒ x = z). But, by (b), ⊢ x =
y ⇒ y = x. Hence, using an instance of the tautology (A ⇒ B) ⇒
((B ⇒ C) ⇒ (A ⇒ C)) and two applications of MP, we obtain ⊢ x =
y ⇒ (y = z ⇒ x = z). By three applications of Gen, ⊢ (∀x)(∀y)(∀z)(x =
y ⇒ (y = z ⇒ x = z)), and then, by three uses of rule A4, ⊢ t = s ⇒
(s = r ⇒ t = r).
Exercises


2.64 Show that (A6) and (A7) are true for any interpretation M in which
(A_1^2)^M is the identity relation on the domain of the interpretation.


2.65 Prove the following in any theory with equality.
a. ⊢ (∀x)(ℬ(x) ⇔ (∃y)(x = y ∧ ℬ(y))) if y does not occur in ℬ(x)
b. ⊢ (∀x)(ℬ(x) ⇔ (∀y)(x = y ⇒ ℬ(y))) if y does not occur in ℬ(x)
c. ⊢ (∀x)(∃y)x = y
d. ⊢ x = y ⇒ f(x) = f(y), where f is any function letter of one argument
e. ⊢ ℬ(x) ∧ x = y ⇒ ℬ(y), if y is free for x in ℬ(x)
f. ⊢ ℬ(x) ∧ ¬ℬ(y) ⇒ x ≠ y, if y is free for x in ℬ(x)


We can reduce schema (A7) to a few simpler cases. 


Proposition 2.24 


Let K be a theory for which (A6) holds and (A7) holds for all atomic wfs
ℬ(x, x) in which there are no individual constants. Then K is a theory with
equality, that is, (A7) holds for all wfs ℬ(x, x).


Proof 


We must prove (A7) for all wfs ℬ(x, x). It holds for atomic wfs without
individual constants by assumption. Note that we have the results of Proposition
2.23, since its proof used (A7) only with atomic wfs without individual constants.
Note also that we have (A7) for all atomic wfs ℬ(x, x). For if ℬ(x, x) contains
individual constants, we can replace those individual constants by new variables,
obtaining a wf ℬ*(x, x) without individual constants. By hypothesis, the
corresponding instance of (A7) with ℬ*(x, x) is a theorem; we can then apply Gen
with respect to the new variables, and finally apply rule A4 one or more times to
obtain (A7) with respect to ℬ(x, x).

Proceeding by induction on the number n of connectives and quantifiers in
ℬ(x, x), we assume that (A7) holds for all k < n.


Case 1. ℬ(x, x) is ¬𝒞(x, x). By inductive hypothesis, we have ⊢ y = x ⇒ (𝒞(x, y)
⇒ 𝒞(x, x)), since 𝒞(x, x) arises from 𝒞(x, y) by replacing some occurrences of
y by x. Hence, by Proposition 2.23(b), instances of the tautologies (A ⇒ B) ⇒
(¬B ⇒ ¬A) and (A ⇒ B) ⇒ ((B ⇒ C) ⇒ (A ⇒ C)) and MP, we obtain ⊢ x = y ⇒
(ℬ(x, x) ⇒ ℬ(x, y)).


Case 2. ℬ(x, x) is 𝒞(x, x) ⇒ 𝒟(x, x). By inductive hypothesis and Proposition
2.23(b), ⊢ x = y ⇒ (𝒞(x, y) ⇒ 𝒞(x, x)) and ⊢ x = y ⇒ (𝒟(x, x) ⇒ 𝒟(x, y)). Hence,
by the tautology (A ⇒ (C_1 ⇒ C)) ⇒ [(A ⇒ (D ⇒ D_1)) ⇒ (A ⇒ ((C ⇒ D) ⇒ (C_1 ⇒
D_1)))], we have ⊢ x = y ⇒ (ℬ(x, x) ⇒ ℬ(x, y)).

Case 3. ℬ(x, x) is (∀z)𝒞(x, x, z). By inductive hypothesis, ⊢ x = y ⇒ (𝒞(x, x, z) ⇒
𝒞(x, y, z)). Now, by Gen and axiom (A5), ⊢ x = y ⇒ (∀z)(𝒞(x, x, z) ⇒ 𝒞(x, y, z)).
By Exercise 2.27(a), ⊢ (∀z)(𝒞(x, x, z) ⇒ 𝒞(x, y, z)) ⇒ [(∀z)𝒞(x, x, z) ⇒ (∀z)𝒞(x, y, z)],
and so, by the tautology (A ⇒ B) ⇒ ((B ⇒ C) ⇒ (A ⇒ C)), ⊢ x = y ⇒ (ℬ(x, x) ⇒
ℬ(x, y)).
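
The proofs of Propositions 2.23 and 2.24 rest on a handful of propositional tautologies. As a sanity check, the truth-table sketch below verifies four of them by brute force. It is only an illustration: the helper names imp and is_tautology and the dictionary labels are assumptions made for this example, and the formulas are written out as they are read from the proofs above.

```python
from itertools import product

def imp(p, q):
    """Material implication p => q."""
    return (not p) or q

def is_tautology(f, n):
    """Check a Boolean function of n statement letters on all 2**n truth assignments."""
    return all(f(*vals) for vals in product([False, True], repeat=n))

tautologies = {
    # (A => (B => C)) => (B => (A => C)), used for Proposition 2.23(b)
    "2.23(b)": (lambda a, b, c: imp(imp(a, imp(b, c)), imp(b, imp(a, c))), 3),
    # (A => B) => ((B => C) => (A => C)), used for Proposition 2.23(c)
    "2.23(c)": (lambda a, b, c: imp(imp(a, b), imp(imp(b, c), imp(a, c))), 3),
    # (A => B) => (not B => not A), used in Case 1 of Proposition 2.24
    "2.24 case 1": (lambda a, b: imp(imp(a, b), imp(not b, not a)), 2),
    # (A => (C1 => C)) => [(A => (D => D1)) => (A => ((C => D) => (C1 => D1)))], Case 2
    "2.24 case 2": (
        lambda a, c1, c, d, d1: imp(
            imp(a, imp(c1, c)),
            imp(imp(a, imp(d, d1)), imp(a, imp(imp(c, d), imp(c1, d1)))),
        ),
        5,
    ),
}
checked = {name: is_tautology(f, n) for name, (f, n) in tautologies.items()}
print(checked)
```

All four entries come out true, so each is indeed a tautology and Proposition 2.1 applies to its instances.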

The instances of (A7) can be still further reduced. 


Proposition 2.25 


Let K be a theory in which (A6) holds and the following are true. 


a. Schema (A7) holds for all atomic wfs ℬ(x, x) such that no function
letters or individual constants occur in ℬ(x, x) and ℬ(x, y) comes
from ℬ(x, x) by replacing exactly one occurrence of x by y.


b. ⊢ x = y ⇒ f_j^n(z_1, ..., z_n) = f_j^n(w_1, ..., w_n), where f_j^n is any function
letter of K, z_1, ..., z_n are variables, and f_j^n(w_1, ..., w_n) arises from
f_j^n(z_1, ..., z_n) by replacing exactly one occurrence of x by y.


Then K is a theory with equality. 


Proof 


By repeated application, our assumptions can be extended to replacements 
of more than one occurrence of x by y. Also, Proposition 2.23 is still deriv- 
able. By Proposition 2.24, it suffices to prove (A7) for only atomic wfs without 
individual constants. But, hypothesis (a) enables us easily to prove 


⊢ (y_1 = z_1 ∧ ... ∧ y_n = z_n) ⇒ (ℬ(y_1, ..., y_n) ⇒ ℬ(z_1, ..., z_n))


for all variables y_1, ..., y_n, z_1, ..., z_n and any atomic wf ℬ(y_1, ..., y_n) without
function letters or individual constants. Hence, it suffices to show:


(*) If t(x, x) is a term without individual constants and t(x, y) comes from
t(x, x) by replacing some occurrences of x by y, then ⊢ x = y ⇒ t(x, x) = t(x, y).*


But (*) can be proved, using hypothesis (b), by induction on the number of 
function letters in t(x,x), and we leave this as an exercise. 

It is easy to see from Proposition 2.25 that, when the language of K has only
finitely many predicate and function letters, it is only necessary to verify
(A7) for a finite list of special cases (in fact, n wfs for each A_j^n and n wfs for
each f_j^n).


* The reader can clarify how (*) is applied by using it to prove the following instance of (A7):
⊢ x = y ⇒ (A_1^1(f_1^1(x)) ⇒ A_1^1(f_1^1(y))). Let t(x, x) be f_1^1(x) and let t(x, y) be f_1^1(y).
Exercises

2.66 Let K_1 be a theory whose language has only = as a predicate letter and
no function letters or individual constants. Let its proper axioms be
(∀x_1)x_1 = x_1, (∀x_1)(∀x_2)(x_1 = x_2 ⇒ x_2 = x_1), and
(∀x_1)(∀x_2)(∀x_3)(x_1 = x_2 ⇒ (x_2 = x_3 ⇒ x_1 = x_3)). Show that K_1 is a theory
with equality. [Hint: It suffices to prove that ⊢ x_1 = x_3 ⇒ (x_1 = x_2 ⇒ x_3 = x_2)
and ⊢ x_2 = x_3 ⇒ (x_1 = x_2 ⇒ x_1 = x_3).] K_1 is called the pure first-order
theory of equality.


2.67 Let K_2 be a theory whose language has only = and < as predicate letters
and no function letters or individual constants. Let K_2 have the following
proper axioms.


a. (∀x_1)x_1 = x_1
b. (∀x_1)(∀x_2)(x_1 = x_2 ⇒ x_2 = x_1)
c. (∀x_1)(∀x_2)(∀x_3)(x_1 = x_2 ⇒ (x_2 = x_3 ⇒ x_1 = x_3))
d. (∀x_1)(∃x_2)(∃x_3)(x_1 < x_2 ∧ x_3 < x_1)
e. (∀x_1)(∀x_2)(∀x_3)(x_1 < x_2 ∧ x_2 < x_3 ⇒ x_1 < x_3)
f. (∀x_1)(∀x_2)(x_1 = x_2 ⇒ ¬x_1 < x_2)
g. (∀x_1)(∀x_2)(x_1 < x_2 ∨ x_1 = x_2 ∨ x_2 < x_1)
h. (∀x_1)(∀x_2)(x_1 < x_2 ⇒ (∃x_3)(x_1 < x_3 ∧ x_3 < x_2))

Using Proposition 2.25, show that K_2 is a theory with equality. K_2 is
called the theory of densely ordered sets with neither first nor last element.


2.68 Let K be any theory with equality. Prove the following.

a. ⊢ x_1 = y_1 ∧ ... ∧ x_n = y_n ⇒ t(x_1, ..., x_n) = t(y_1, ..., y_n), where t(y_1, ..., y_n)
arises from the term t(x_1, ..., x_n) by substitution of y_1, ..., y_n for x_1, ...,
x_n, respectively.

b. ⊢ x_1 = y_1 ∧ ... ∧ x_n = y_n ⇒ (ℬ(x_1, ..., x_n) ⇔ ℬ(y_1, ..., y_n)), where ℬ(y_1, ...,
y_n) is obtained by substituting y_1, ..., y_n for one or more occurrences
of x_1, ..., x_n, respectively, in the wf ℬ(x_1, ..., x_n), and y_1, ..., y_n are free
for x_1, ..., x_n, respectively, in the wf ℬ(x_1, ..., x_n).


Examples 


(In the literature, “elementary” is sometimes used instead of “first-order.”) 


1. Elementary theory G of groups: predicate letter =, function letter f_1^2,
and individual constant a_1. We abbreviate f_1^2(t, s) by t + s and a_1 by 0.
The proper axioms are the following.


a. x_1 + (x_2 + x_3) = (x_1 + x_2) + x_3
b. x_1 + 0 = x_1
c. (∀x_1)(∃x_2)x_1 + x_2 = 0
d. x_1 = x_1
e. x_1 = x_2 ⇒ x_2 = x_1
f. x_1 = x_2 ⇒ (x_2 = x_3 ⇒ x_1 = x_3)
g. x_1 = x_2 ⇒ (x_1 + x_3 = x_2 + x_3 ∧ x_3 + x_1 = x_3 + x_2)

That G is a theory with equality follows easily from Proposition 2.25.
If one adds to the axioms the following wf:

h. x_1 + x_2 = x_2 + x_1

the new theory is called the elementary theory of abelian groups.

2. Elementary theory F of fields: predicate letter =, function letters f_1^2 and
f_2^2, and individual constants a_1 and a_2. Abbreviate f_1^2(t, s) by t + s,
f_2^2(t, s) by t · s, and a_1 and a_2 by 0 and 1. As proper axioms, take (a)-(h)
of Example 1 plus the following.

i. x_1 = x_2 ⇒ (x_1 · x_3 = x_2 · x_3 ∧ x_3 · x_1 = x_3 · x_2)
j. x_1 · (x_2 · x_3) = (x_1 · x_2) · x_3
k. x_1 · (x_2 + x_3) = (x_1 · x_2) + (x_1 · x_3)
l. x_1 · x_2 = x_2 · x_1
m. x_1 · 1 = x_1
n. x_1 ≠ 0 ⇒ (∃x_2)x_1 · x_2 = 1
o. 0 ≠ 1

F is a theory with equality. Axioms (a)-(m) define the elementary theory R_C
of commutative rings with unit. If we add to F the predicate letter A_2^2,
abbreviate A_2^2(t, s) by t < s, and add axioms (e), (f), and (g) of Exercise 2.67, as
well as x_1 < x_2 ⇒ x_1 + x_3 < x_2 + x_3 and x_1 < x_2 ∧ 0 < x_3 ⇒ x_1 · x_3 < x_2 · x_3, then
the new theory F_< is called the elementary theory of ordered fields.
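
For a finite interpretation, the axioms of F can be checked mechanically. The following Python sketch is an illustration under assumptions that are not in the text: the domain is the integers modulo p, + and · are read as arithmetic modulo p, = is read as identity (so the interpretation is normal), and the function name check_field_axioms is hypothetical. It evaluates axioms (a)-(o) by exhaustive search, reading free variables as universally quantified.

```python
from itertools import product

def imp(a, b):
    """Material implication a => b."""
    return (not a) or b

def check_field_axioms(p):
    """Evaluate axioms (a)-(o) of F in the interpretation with domain {0, ..., p-1}."""
    D = range(p)
    zero, one = 0, 1 % p

    def add(x, y):
        return (x + y) % p

    def mul(x, y):
        return (x * y) % p

    pairs = list(product(D, repeat=2))
    triples = list(product(D, repeat=3))
    return {
        "a": all(add(x, add(y, z)) == add(add(x, y), z) for x, y, z in triples),
        "b": all(add(x, zero) == x for x in D),
        "c": all(any(add(x, y) == zero for y in D) for x in D),
        "d": all(x == x for x in D),
        "e": all(imp(x == y, y == x) for x, y in pairs),
        "f": all(imp(x == y, imp(y == z, x == z)) for x, y, z in triples),
        "g": all(imp(x == y, add(x, z) == add(y, z) and add(z, x) == add(z, y))
                 for x, y, z in triples),
        "h": all(add(x, y) == add(y, x) for x, y in pairs),
        "i": all(imp(x == y, mul(x, z) == mul(y, z) and mul(z, x) == mul(z, y))
                 for x, y, z in triples),
        "j": all(mul(x, mul(y, z)) == mul(mul(x, y), z) for x, y, z in triples),
        "k": all(mul(x, add(y, z)) == add(mul(x, y), mul(x, z)) for x, y, z in triples),
        "l": all(mul(x, y) == mul(y, x) for x, y in pairs),
        "m": all(mul(x, one) == x for x in D),
        "n": all(imp(x != zero, any(mul(x, y) == one for y in D)) for x in D),
        "o": zero != one,
    }

failing_mod_5 = [name for name, ok in check_field_axioms(5).items() if not ok]
failing_mod_4 = [name for name, ok in check_field_axioms(4).items() if not ok]
print(failing_mod_5, failing_mod_4)
```

With p = 5 every axiom holds, so the integers modulo 5 form a normal model of F. With p = 4 only axiom (n) fails, since 2 has no multiplicative inverse, so that interpretation is a model of R_C but not of F.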


Exercise 


2.69 a. What formulas must be derived in order to use Proposition 2.25 to 
conclude that the theory G of Example 1 is a theory with equality? 


b. Show that the axioms (d)-(f) of equality mentioned in Example 1 
can be replaced by (d) and 


(f′) x_1 = x_2 ⇒ (x_3 = x_2 ⇒ x_1 = x_3).


One often encounters theories K in which = may be defined; that is, there
is a wf ℰ(x, y) with two free variables x and y, such that, if we abbreviate
ℰ(t, s) by t = s, then axioms (A6) and (A7) are provable in K. We make the
convention that, if t and s are terms that are not free for x and y, respectively,
in ℰ(x, y), then, by suitable changes of bound variables (see Exercise 2.48), we
replace ℰ(x, y) by a logically equivalent wf ℰ*(x, y) such that t and s are free for
x and y, respectively, in ℰ*(x, y); then t = s is to be the abbreviation of ℰ*(t, s).
Proposition 2.23 and analogues of Propositions 2.24 and 2.25 hold for such
theories. There is no harm in extending the term theory with equality to cover
such theories.

In theories with equality it is possible to define in the following way phrases 
that use the expression “There exists one and only one x such that...” 


Definition


(∃_1x)ℬ(x) for (∃x)ℬ(x) ∧ (∀x)(∀y)(ℬ(x) ∧ ℬ(y) ⇒ x = y)


In this definition, the new variable y is assumed to be the first variable that
does not occur in ℬ(x). A similar convention is to be made in all other
definitions where new variables are introduced.
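
In a finite normal interpretation, the defining expansion of (∃_1x)ℬ(x) can be compared directly with counting. The Python sketch below is only an illustration: the domain {0, ..., 5}, the three sample properties, and the function names are assumptions made for this example, and = is read as identity.

```python
def exists_unique_by_definition(domain, B):
    """(Ex)B(x) and (Ax)(Ay)(B(x) and B(y) => x = y), with = read as identity."""
    exists = any(B(x) for x in domain)
    at_most_one = all((not (B(x) and B(y))) or x == y for x in domain for y in domain)
    return exists and at_most_one

def exactly_one(domain, B):
    """Direct count: exactly one element of the domain has property B."""
    return sum(1 for x in domain if B(x)) == 1

domain = range(6)
properties = {
    "x*x == 4": lambda x: x * x == 4,     # holds only for 2
    "x is even": lambda x: x % 2 == 0,    # holds for 0, 2, 4
    "x > 10": lambda x: x > 10,           # holds for no element
}
table = {
    name: (exists_unique_by_definition(domain, B), exactly_one(domain, B))
    for name, B in properties.items()
}
print(table)
```

The two components agree in every row: the first conjunct of the definition rules out the empty case and the second rules out two or more witnesses.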


Exercise 


2.70 In any theory with equality, prove the following.
a. ⊢ (∀x)(∃_1y)x = y
b. ⊢ (∃_1x)ℬ(x) ⇔ (∃x)(∀y)(x = y ⇔ ℬ(y))
c. ⊢ (∀x)(ℬ(x) ⇔ 𝒞(x)) ⇒ [(∃_1x)ℬ(x) ⇔ (∃_1x)𝒞(x)]
d. ⊢ (∃_1x)(ℬ ∨ 𝒞) ⇒ ((∃_1x)ℬ) ∨ (∃_1x)𝒞
e. ⊢ (∃_1x)ℬ(x) ⇒ (∃x)(ℬ(x) ∧ (∀y)(ℬ(y) ⇒ y = x))


In any model for a theory K with equality, the relation E in the model corre- 
sponding to the predicate letter = is an equivalence relation (by Proposition 
2.23). If this relation E is the identity relation in the domain of the model, 
then the model is said to be normal. 

Any model M for K can be contracted to a normal model M* for K by taking the
domain D* of M* to be the set of equivalence classes determined by the relation
E in the domain D of M. For a predicate letter A_j^n and for any equivalence classes
[b_1], ..., [b_n] in D* determined by elements b_1, ..., b_n in D, we let (A_j^n)^{M*} hold for
([b_1], ..., [b_n]) if and only if (A_j^n)^M holds for (b_1, ..., b_n). Notice that it makes no
difference which representatives b_1, ..., b_n we select in the given equivalence classes
because, from (A7), ⊢ x_1 = y_1 ∧ ... ∧ x_n = y_n ⇒ (A_j^n(x_1, ..., x_n) ⇔ A_j^n(y_1, ..., y_n)).
Likewise, for any function letter f_j^n and any equivalence classes [b_1], ...,
[b_n] in D*, let (f_j^n)^{M*}([b_1], ..., [b_n]) = [(f_j^n)^M(b_1, ..., b_n)]. Again note that this is
independent of the choice of the representatives b_1, ..., b_n, since, from (A7),
we can prove ⊢ x_1 = y_1 ∧ ... ∧ x_n = y_n ⇒ f_j^n(x_1, ..., x_n) = f_j^n(y_1, ..., y_n). For any
individual constant a_i let (a_i)^{M*} = [(a_i)^M]. The relation E* corresponding to = in
the model M* is the identity relation in D*: E*([b_1], [b_2]) if and only if E(b_1, b_2),
that is, if and only if [b_1] = [b_2]. Now one can easily prove by induction the
following lemma: If s = (b_1, b_2, ...) is a denumerable sequence of elements of
D, and s’ = ([b_1], [b_2], ...) is the corresponding sequence of equivalence classes,
then a wf ℬ is satisfied by s in M if and only if ℬ is satisfied by s’ in M*. It
follows that, for any wf ℬ, ℬ is true for M if and only if ℬ is true for M*. Hence,
because M is a model of K, M* is a normal model of K.
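
The contraction of a model to a normal model is easy to carry out by hand on a finite example. The Python sketch below is an illustration under assumptions that are not in the text: the nonnormal interpretation has domain {0, ..., 5}, interprets = as congruence modulo 3, and has one unary function f and one unary predicate P chosen so that they respect that congruence. All names are hypothetical.

```python
from itertools import product

# A nonnormal interpretation: "=" is congruence mod 3, which is not identity on D.
D = range(6)

def E(a, b):
    return a % 3 == b % 3

def f(a):
    return (a + 1) % 6

def P(a):
    return a % 3 == 0

def cls(b):
    """The equivalence class [b] of b under E."""
    return frozenset(d for d in D if E(b, d))

def rep(c):
    """A fixed representative of the class c."""
    return min(c)

D_star = {cls(b) for b in D}

def well_defined():
    """Check that P and f do not depend on the representative chosen."""
    return all(
        P(a) == P(b) and cls(f(a)) == cls(f(b))
        for a, b in product(D, repeat=2)
        if E(a, b)
    )

# Interpretations in the contracted model M*, defined through representatives.
P_star = {c: P(rep(c)) for c in D_star}
f_star = {c: cls(f(rep(c))) for c in D_star}

def E_star(c1, c2):
    return E(rep(c1), rep(c2))

is_normal = all(E_star(c1, c2) == (c1 == c2) for c1, c2 in product(D_star, repeat=2))
print(len(D_star), well_defined(), is_normal)
```

The six-element domain contracts to three classes, and the relation interpreting = on the classes is the identity, which is what makes M* normal.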


Proposition 2.26 (Extension of Proposition 2.17) 


(Gödel, 1930) Any consistent theory with equality K has a finite or
denumerable normal model.


Proof 


By Proposition 2.17, K has a denumerable model M. Hence, the contraction 
of M to a normal model yields a finite or denumerable normal model M* 
because the set of equivalence classes in a denumerable set D is either finite 
or denumerable. 


Corollary 2.27 (Extension of the Skolem–Löwenheim Theorem)


Any theory with equality K that has an infinite normal model M has a denu- 
merable normal model. 


Proof 


Add to K the denumerably many new individual constants b_1, b_2, ... together
with the axioms b_i ≠ b_j for i ≠ j. Then the new theory K’ is consistent. If K’
were inconsistent, there would be a proof in K’ of a contradiction 𝒞 ∧ ¬𝒞,
where we may assume that 𝒞 is a wf of K. But this proof uses only a finite
number of the new axioms: b_{i_1} ≠ b_{j_1}, ..., b_{i_n} ≠ b_{j_n}. Now, M can be extended to a
model M* of K plus the axioms b_{i_1} ≠ b_{j_1}, ..., b_{i_n} ≠ b_{j_n}; in fact, since M is an
infinite normal model, we can choose interpretations of b_{i_1}, b_{j_1}, ..., b_{i_n}, b_{j_n} so that
the wfs b_{i_1} ≠ b_{j_1}, ..., b_{i_n} ≠ b_{j_n} are true. But, since 𝒞 ∧ ¬𝒞 is derivable from these
wfs and the axioms of K, it would follow that 𝒞 ∧ ¬𝒞 is true for M*, which is
impossible. Hence, K’ must be consistent. Now, by Proposition 2.26, K’ has a
finite or denumerable normal model N. But, since, for i ≠ j, the wfs b_i ≠ b_j are
axioms of K’, they are true for N. Thus, the elements in the domain of N that
are the interpretations of b_1, b_2, ... must be distinct, which implies that the
domain of N is infinite and, therefore, denumerable. Since K’ is an extension of
K, N is also a denumerable normal model of K.
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Exercises 


2.71 We define (4,,x).7(x) by induction on n > 1. The case n = 1 has already 
been taken care of. Let (4,,,:%).7(x) stand for (Ay)(A(y) A (4,0) «AY AF 

(x))). 

a. Show that (4,,x).7(x) asserts that there are exactly n objects for 
which .7holds, in the sense that in any normal model for (4,,x).7(x) 
there are exactly n objects for which the property corresponding 
to A(x) holds. 

b. i. For each positive integer n, write a closed wf .4, such that -4, 

is true in a normal model when and only when that model 
contains at least n elements. 


ii. Prove that the theory K, whose axioms are those of the pure 
theory of equality K, (see Exercise 2.66), plus the axioms .4, 
Ay ..., is not finitely axiomatizable, that is, there is no theory 
K’ with a finite number of axioms such that K and K’ have the 
same theorems. 

iii, For anormal model, state in ordinary English the meaning of 
VAs 

c. Letnbea positive integer and consider the wf (,) (4,,x)x =x. Let L,, 

be the theory K, + {4}, where K, is the pure theory of equality. 

i. Show that a normal model M is a model of L,, if and only if 
there are exactly n elements in the domain of M. 

ii. Define a procedure for determining whether any given sen- 
tence is a theorem of L,, and show that L,, is a complete theory. 

2.72 a. Prove that, if a theory with equality K has arbitrarily large finite 

normal models, then it has a denumerable normal model. 
b. Prove that there is no theory with equality whose normal models 
are precisely all finite normal interpretations. 

2.73 Prove that any predicate calculus with equality is consistent. (A predi- 
cate calculus with equality is assumed to have (A1)-(A7) as its only 
axioms.) 

2.74^D Prove the independence of axioms (A1)-(A7) in any predicate calculus
with equality.

2.75 If $\mathscr{B}$ is a wf that does not contain the = symbol and $\mathscr{B}$ is provable in a
predicate calculus with equality K, show that $\mathscr{B}$ is provable in K without
using (A6) or (A7).

2.76^D Show that = can be defined in any theory whose language has only a
finite number of predicate letters and no function letters.

2.77 a. Find a nonnormal model of the elementary theory of groups G.

b. Show that any model M of a theory with equality K can be
extended to a nonnormal model of K. [Hint: Use the argument in
the proof of the lemma within the proof of Corollary 2.22.]

2.78 Let $\mathscr{B}$ be a wf of a theory with equality K. Show that $\mathscr{B}$ is true in every
normal model of K if and only if $\vdash_K \mathscr{B}$.

2.79 Write the following as wfs of a theory with equality.

a. There are at least three moons of Jupiter.

b. At most two people know everyone in the class.

c. Everyone in the logic class knows at least two members of the
geometry class.

d. Every person loves at most one other person.

2.80 If P(x) means x is a person, A(x, y) means x is a parent of y, G(x, y) means
x is a grandparent of y, and x = y means x and y are identical, translate the
following wfs into ordinary English.

i. $(\forall x)(P(x) \Rightarrow [(\forall y)(G(y, x) \Leftrightarrow (\exists w)(A(y, w) \wedge A(w, x)))])$

ii. $(\forall x)(P(x) \Rightarrow (\exists x_1)(\exists x_2)(\exists x_3)(\exists x_4)(x_1 \neq x_2 \wedge x_1 \neq x_3 \wedge x_1 \neq x_4 \wedge x_2 \neq x_3 \wedge x_2 \neq x_4 \wedge x_3 \neq x_4 \wedge G(x_1, x) \wedge G(x_2, x) \wedge G(x_3, x) \wedge G(x_4, x) \wedge (\forall y)(G(y, x) \Rightarrow y = x_1 \vee y = x_2 \vee y = x_3 \vee y = x_4)))$

2.81 Consider the wf

(*) $(\forall x)(\forall y)(\exists z)(z \neq x \wedge z \neq y \wedge A(z))$.

Show that (*) is true in a normal model M of a theory with equality if
and only if there exist in the domain of M at least three things having
property A(z).

2.82 Let the language $\mathscr{L}$ have the four predicate letters =, P, S, and L. Read
u = v as u and v are identical, P(u) as u is a point, S(u) as u is a line, and
L(u, v) as u lies on v. Let the theory of equality G of planar incidence
geometry have, in addition to axioms (A1)-(A7), the following nonlogical
axioms.

1. $P(x) \Rightarrow \neg S(x)$

2. $L(x, y) \Rightarrow P(x) \wedge S(y)$

3. $S(x) \Rightarrow (\exists y)(\exists z)(y \neq z \wedge L(y, x) \wedge L(z, x))$

4. $P(x) \wedge P(y) \wedge x \neq y \Rightarrow (\exists_1 z)(S(z) \wedge L(x, z) \wedge L(y, z))$

5. $(\exists x)(\exists y)(\exists z)(P(x) \wedge P(y) \wedge P(z) \wedge \neg\mathscr{C}(x, y, z))$

where $\mathscr{C}(x, y, z)$ is the wf $(\exists u)(S(u) \wedge L(x, u) \wedge L(y, u) \wedge L(z, u))$, which is
read as x, y, z are collinear.

a. Translate (1)-(5) into ordinary geometric language.

b. Prove $\vdash_G (\forall u)(\forall v)(S(u) \wedge S(v) \wedge u \neq v \Rightarrow (\forall x)(\forall y)(L(x, u) \wedge L(x, v) \wedge L(y, u) \wedge L(y, v) \Rightarrow x = y))$,
and translate this theorem into ordinary geometric language.

c. Let R(u, v) stand for $S(u) \wedge S(v) \wedge \neg(\exists w)(L(w, u) \wedge L(w, v))$. Read R(u, v)
as u and v are distinct parallel lines.

i. Prove: $\vdash_G R(u, v) \Rightarrow u \neq v$

ii. Show that there exists a normal model of G with a finite
domain in which the following sentence is true:

$(\forall x)(\forall y)(S(x) \wedge P(y) \wedge \neg L(y, x) \Rightarrow (\exists_1 z)(L(y, z) \wedge R(z, x)))$

d. Show that there exists a model of G in which the following sentence
is true:

$(\forall x)(\forall y)(S(x) \wedge S(y) \wedge x \neq y \Rightarrow \neg R(x, y))$


2.9 Definitions of New Function Letters 
and Individual Constants 


In mathematics, once we have proved, for any $y_1, \ldots, y_n$, the existence of
a unique object u that has a property $\mathscr{B}(u, y_1, \ldots, y_n)$, we often introduce a
new function letter $f(y_1, \ldots, y_n)$ such that $\mathscr{B}(f(y_1, \ldots, y_n), y_1, \ldots, y_n)$ holds for all
$y_1, \ldots, y_n$. In cases where we have proved the existence of a unique object u
that satisfies a wf $\mathscr{B}(u)$ and $\mathscr{B}(u)$ contains u as its only free variable, we
introduce a new individual constant b such that $\mathscr{B}(b)$ holds. It is generally
acknowledged that such definitions, though convenient, add nothing really
new to the theory. This can be made precise in the following manner.


Proposition 2.28 


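Let K be a theory with equality. Assume that $\vdash_K (\exists_1 u)\mathscr{B}(u, y_1, \ldots, y_n)$. Let $K^\#$
be the theory with equality obtained by adding to K a new function letter f
of n arguments and the proper axiom $\mathscr{B}(f(y_1, \ldots, y_n), y_1, \ldots, y_n)$,* as well as all
instances of axioms (A1)-(A7) that involve f. Then there is an effective
transformation mapping each wf $\mathscr{C}$ of $K^\#$ into a wf $\mathscr{C}^\#$ of K such that:

a. If f does not occur in $\mathscr{C}$, then $\mathscr{C}^\#$ is $\mathscr{C}$.
b. $(\neg\mathscr{C})^\#$ is $\neg(\mathscr{C}^\#)$.
c. $(\mathscr{C} \Rightarrow \mathscr{D})^\#$ is $\mathscr{C}^\# \Rightarrow \mathscr{D}^\#$.
d. $((\forall x)\mathscr{C})^\#$ is $(\forall x)(\mathscr{C}^\#)$.
e. $\vdash_{K^\#} (\mathscr{C} \Leftrightarrow \mathscr{C}^\#)$.
f. If $\vdash_{K^\#} \mathscr{C}$, then $\vdash_K \mathscr{C}^\#$.

Hence, if $\mathscr{C}$ does not contain f and $\vdash_{K^\#} \mathscr{C}$, then $\vdash_K \mathscr{C}$.

* It is better to take this axiom in the form $(\forall u)(u = f(y_1, \ldots, y_n) \Rightarrow \mathscr{B}(u, y_1, \ldots, y_n))$, since $f(y_1, \ldots, y_n)$
might not be free for u in $\mathscr{B}(u, y_1, \ldots, y_n)$.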


Proof 


By a simple f-term we mean an expression $f(t_1, \ldots, t_n)$ in which $t_1, \ldots, t_n$ are terms
that do not contain f. Given an atomic wf $\mathscr{C}$ of $K^\#$, let $\mathscr{C}^*$ be the result of replacing
the leftmost occurrence of a simple f-term $f(t_1, \ldots, t_n)$ in $\mathscr{C}$ by the first variable
v not in $\mathscr{C}$ or $\mathscr{B}$. Call the wf $(\exists v)(\mathscr{B}(v, t_1, \ldots, t_n) \wedge \mathscr{C}^*)$ the f-transform of $\mathscr{C}$. If $\mathscr{C}$ does
not contain f, then let $\mathscr{C}$ be its own f-transform. Clearly,
$\vdash_{K^\#} (\exists v)(\mathscr{B}(v, t_1, \ldots, t_n) \wedge \mathscr{C}^*) \Leftrightarrow \mathscr{C}$. (Here, we use $\vdash_K (\exists_1 u)\mathscr{B}(u, y_1, \ldots, y_n)$ and the
axiom $\mathscr{B}(f(y_1, \ldots, y_n), y_1, \ldots, y_n)$ of $K^\#$.) Since the f-transform $\mathscr{C}'$ of $\mathscr{C}$ contains one
less f than $\mathscr{C}$ and $\vdash_{K^\#} \mathscr{C}' \Leftrightarrow \mathscr{C}$, if we take successive f-transforms, eventually we
obtain a wf $\mathscr{C}^\#$ that does not contain f and such that $\vdash_{K^\#} \mathscr{C}^\# \Leftrightarrow \mathscr{C}$. Call $\mathscr{C}^\#$ the
f-less transform of $\mathscr{C}$. Extend the definition to all wfs of $K^\#$ by letting $(\neg\mathscr{D})^\#$ be
$\neg(\mathscr{D}^\#)$, $(\mathscr{D} \Rightarrow \mathscr{E})^\#$ be $\mathscr{D}^\# \Rightarrow \mathscr{E}^\#$, and $((\forall x)\mathscr{D})^\#$ be $(\forall x)\mathscr{D}^\#$. Properties (a)-(e) of
Proposition 2.28 are then obvious. To prove property (f), it suffices, by property (e),
to show that, if $\mathscr{C}$ does not contain f and $\vdash_{K^\#} \mathscr{C}$, then $\vdash_K \mathscr{C}$. We may assume that
$\mathscr{C}$ is a closed wf, since a wf and its closure are deducible from each other.

Assume that M is a model of K. Let $M_1$ be the normal model obtained by
contracting M. We know that a wf is true for M if and only if it is true for
$M_1$. Since $\vdash_K (\exists_1 u)\mathscr{B}(u, y_1, \ldots, y_n)$, then, for any $b_1, \ldots, b_n$ in the domain of $M_1$,
there is a unique c in the domain of $M_1$ such that $\vDash_{M_1} \mathscr{B}[c, b_1, \ldots, b_n]$. If we
define $f_1(b_1, \ldots, b_n)$ to be c, then, taking $f_1$ to be the interpretation of the function
letter f, we obtain from $M_1$ a model $M^\#$ of $K^\#$. For the logical axioms of $K^\#$
(including the equality axioms of $K^\#$) are true in any normal interpretation,
and the axiom $\mathscr{B}(f(y_1, \ldots, y_n), y_1, \ldots, y_n)$ also holds in $M^\#$ by virtue of the definition
of $f_1$. Since the other proper axioms of $K^\#$ do not contain f and since
they are true for $M_1$, they are also true for $M^\#$. But $\vdash_{K^\#} \mathscr{C}$. Therefore, $\mathscr{C}$ is true
for $M^\#$, but since $\mathscr{C}$ does not contain f, $\mathscr{C}$ is true for $M_1$ and hence also for M.
Thus, $\mathscr{C}$ is true for every model of K. Therefore, by Corollary 2.20(a), $\vdash_K \mathscr{C}$.
(In the case where $\vdash_K (\exists_1 u)\mathscr{B}(u)$ and $\mathscr{B}(u)$ contains only u as a free variable,
we form $K^\#$ by adding a new individual constant b and the axiom $\mathscr{B}(b)$. Then
the analogue of Proposition 2.28 follows from practically the same proof as
the one just given.)
Exercise

2.83 Find the f-less transforms of the following wfs.
a. $(\forall x)(\exists y)(A_1^3(x, y, f(x, y_1, \ldots, y_n)) \Rightarrow f(y, x, \ldots, x) = x)$
b. $A_1^1(f(y_1, \ldots, y_{n-1}, f(y_1, \ldots, y_n))) \wedge (\exists x)A_1^2(x, f(y_1, \ldots, y_n))$


Note that Proposition 2.28 also applies when we have introduced several
new symbols $f_1, \ldots, f_m$, because we can assume that we have added each $f_i$ to
the theory already obtained by the addition of $f_1, \ldots, f_{i-1}$; then m successive
applications of Proposition 2.28 are necessary. The resulting wf $\mathscr{C}^\#$ of K can be
considered an $(f_1, \ldots, f_m)$-free transform of $\mathscr{C}$ into the language of K.


Examples 


1. In the elementary theory G of groups, one can prove $(\exists_1 y)\, x + y = 0$.
Then introduce a new function letter f of one argument, abbreviate f(t) by
(-t), and add the new axiom $x + (-x) = 0$. By Proposition 2.28, we now
are not able to prove any wf of G that we could not prove before.
Thus, the definition of (-t) adds no really new power to the original
theory.


2. In the elementary theory F of fields, one can prove that $(\exists_1 y)((x \neq 0 \wedge
x \cdot y = 1) \vee (x = 0 \wedge y = 0))$. We then introduce a new function letter g
of one argument, abbreviate g(t) by $t^{-1}$, and add the axiom $(x \neq 0 \wedge x \cdot
x^{-1} = 1) \vee (x = 0 \wedge x^{-1} = 0)$, from which one can prove $x \neq 0 \Rightarrow x \cdot x^{-1} = 1$.
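
The two examples can be checked on small finite models. The following is only an illustrative sketch: the choice of the additive group of integers modulo 6 and the field of integers modulo 5, as well as the helper name `unique_witness`, are assumptions made here and do not come from the text. It verifies that the defining condition has exactly one witness for each x, which is what licenses introducing the new function letter.

```python
def unique_witness(domain, holds):
    """Return the unique y in `domain` with holds(y); fail if it is not unique."""
    witnesses = [y for y in domain if holds(y)]
    if len(witnesses) != 1:
        raise ValueError(f"expected exactly one witness, found {len(witnesses)}")
    return witnesses[0]

# Example 1 (assumed model): the additive group of integers modulo 6.
n = 6
Zn = range(n)
# neg[x] plays the role of (-x): the unique y with x + y = 0.
neg = {x: unique_witness(Zn, lambda y: (x + y) % n == 0) for x in Zn}

# Example 2 (assumed model): the field of integers modulo 5.
p = 5
Zp = range(p)
# inv[x] plays the role of x^{-1}: the unique y with
# (x != 0 and x*y = 1) or (x = 0 and y = 0).
inv = {
    x: unique_witness(
        Zp, lambda y: (x != 0 and (x * y) % p == 1) or (x == 0 and y == 0)
    )
    for x in Zp
}

# The new axioms hold in the expanded models.
axiom_group = all((x + neg[x]) % n == 0 for x in Zn)
axiom_field = all((x * inv[x]) % p == 1 for x in Zp if x != 0) and inv[0] == 0
```

Because the witness is unique, the interpretation of the new function letter is forced, which mirrors the way $f_1$ is defined in the proof of Proposition 2.28.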


From the statement and proof of Proposition 2.28 we can see that, in theories
with equality, only predicate letters are needed; function letters and individual
constants are dispensable. If $f_j^n$ is a function letter, we can replace it
by a new predicate letter $A_k^{n+1}$ if we add the axiom $(\exists_1 u)A_k^{n+1}(u, y_1, \ldots, y_n)$. An
individual constant is to be replaced by a new predicate letter $A_k^1$ if we add
the axiom $(\exists_1 u)A_k^1(u)$.


Example 


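In the elementary theory G of groups, we can replace + and 0 by predicate
letters $A_1^3$ and $A_1^1$ if we add the axioms $(\forall x_1)(\forall x_2)(\exists_1 x_3)A_1^3(x_1, x_2, x_3)$ and
$(\exists_1 x_1)A_1^1(x_1)$, and if we replace axioms (a), (b), (c), and (g) by the following:


a'. $A_1^3(x_2, x_3, u) \wedge A_1^3(x_1, u, v) \wedge A_1^3(x_1, x_2, w) \wedge A_1^3(w, x_3, y) \Rightarrow v = y$
b'. $A_1^1(y) \wedge A_1^3(x, y, z) \Rightarrow z = x$
c'. $(\exists y)(\forall u)(\forall v)(A_1^1(u) \wedge A_1^3(x, y, v) \Rightarrow v = u)$
g'. $[x_1 = x_2 \wedge A_1^3(x_1, y, z) \wedge A_1^3(x_2, y, u) \wedge A_1^3(y, x_1, v) \wedge A_1^3(y, x_2, w)] \Rightarrow z = u \wedge v = w$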
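
As a sanity check on the relational axioms, one can verify them by brute force in a small normal model. The sketch below is an illustration under assumptions made here: the model is the additive group of integers modulo 3, `A3(x, y, z)` is read as x + y = z, and `A1(u)` is read as u = 0. The function names are hypothetical and not part of the text.

```python
from itertools import product

n = 3
D = range(n)

def A3(x, y, z):
    """Graph of addition modulo n: holds when x + y = z."""
    return (x + y) % n == z

def A1(u):
    """Holds exactly for the identity element."""
    return u == 0

def exactly_one(pred):
    """True when exactly one element of D satisfies pred."""
    return sum(1 for d in D if pred(d)) == 1

# The two added axioms: A3 is the graph of a function, A1 picks out one object.
functional = all(exactly_one(lambda z: A3(x, y, z)) for x, y in product(D, repeat=2))
constant = exactly_one(A1)

# a'. associativity, stated relationally.
assoc = all(
    v == y
    for x1, x2, x3, u, v, w, y in product(D, repeat=7)
    if A3(x2, x3, u) and A3(x1, u, v) and A3(x1, x2, w) and A3(w, x3, y)
)

# b'. right identity.
identity = all(z == x for x, y, z in product(D, repeat=3) if A1(y) and A3(x, y, z))

# c'. every x has some y whose sum with x is the identity.
inverse = all(
    any(
        all(v == u for u, v in product(D, repeat=2) if A1(u) and A3(x, y, v))
        for y in D
    )
    for x in D
)
```

Axiom g' is a substitutivity condition for equality; in a normal model, where = is interpreted as identity, it holds automatically once `functional` is true.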
Notice that the proof of Proposition 2.28 is highly nonconstructive, since it
uses semantical notions (model, truth) and is based upon Corollary 2.20(a), 
which was proved in a nonconstructive way. Constructive syntactical proofs 
have been given for Proposition 2.28 (see Kleene, 1952, § 74), but, in general, 
they are quite complex. 

Descriptive phrases of the kind "the u such that $\mathscr{B}(u, y_1, \ldots, y_n)$" are
very common in ordinary language and in mathematics. Such phrases
are called definite descriptions. We let $\iota u(\mathscr{B}(u, y_1, \ldots, y_n))$ denote the unique
object u such that $\mathscr{B}(u, y_1, \ldots, y_n)$ if there is such a unique object. If there
is no such unique object, either we may let $\iota u(\mathscr{B}(u, y_1, \ldots, y_n))$ stand for
some fixed object, or we may consider it meaningless. (For example, we
may say that the phrases "the present king of France" and "the smallest
integer" are meaningless or we may arbitrarily make the convention that
they denote 0.) There are various ways of incorporating these $\iota$-terms in
formalized theories, but since in most cases the same results are obtained 
by using new function letters or individual constants as above, and since 
they all lead to theorems similar to Proposition 2.28, we shall not discuss 
them any further here. For details, see Hilbert and Bernays (1934) and 
Rosser (1939, 1953). 


2.10 Prenex Normal Forms 


A wf $(Q_1 y_1) \ldots (Q_n y_n)\mathscr{B}$, where each $(Q_i y_i)$ is either $(\forall y_i)$ or $(\exists y_i)$, $y_i$ is different
from $y_j$ for $i \neq j$, and $\mathscr{B}$ contains no quantifiers, is said to be in prenex normal
form. (We include the case n = 0, when there are no quantifiers at all.) We
shall prove that, for every wf, we can construct an equivalent prenex normal
form.


Lemma 2.29 


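In any theory, if y is not free in $\mathscr{D}$, and $\mathscr{C}(x)$ and $\mathscr{C}(y)$ are similar, then the
following hold.


a. $\vdash ((\forall x)\mathscr{C}(x) \Rightarrow \mathscr{D}) \Leftrightarrow (\exists y)(\mathscr{C}(y) \Rightarrow \mathscr{D})$
b. $\vdash ((\exists x)\mathscr{C}(x) \Rightarrow \mathscr{D}) \Leftrightarrow (\forall y)(\mathscr{C}(y) \Rightarrow \mathscr{D})$
c. $\vdash (\mathscr{D} \Rightarrow (\forall x)\mathscr{C}(x)) \Leftrightarrow (\forall y)(\mathscr{D} \Rightarrow \mathscr{C}(y))$
d. $\vdash (\mathscr{D} \Rightarrow (\exists x)\mathscr{C}(x)) \Leftrightarrow (\exists y)(\mathscr{D} \Rightarrow \mathscr{C}(y))$
e. $\vdash \neg(\forall x)\mathscr{C} \Leftrightarrow (\exists x)\neg\mathscr{C}$
f. $\vdash \neg(\exists x)\mathscr{C} \Leftrightarrow (\forall x)\neg\mathscr{C}$
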
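Proof

For part (a):

1. $(\forall x)\mathscr{C}(x) \Rightarrow \mathscr{D}$    Hyp
2. $\neg(\exists y)(\mathscr{C}(y) \Rightarrow \mathscr{D})$    Hyp
3. $\neg\neg(\forall y)\neg(\mathscr{C}(y) \Rightarrow \mathscr{D})$    2, abbreviation
4. $(\forall y)\neg(\mathscr{C}(y) \Rightarrow \mathscr{D})$    3, negation elimination
5. $(\forall y)(\mathscr{C}(y) \wedge \neg\mathscr{D})$    4, tautology, Proposition 2.9(c)
6. $\mathscr{C}(y) \wedge \neg\mathscr{D}$    5, rule A4
7. $\mathscr{C}(y)$    6, conjunction elimination
8. $(\forall y)\mathscr{C}(y)$    7, Gen
9. $(\forall x)\mathscr{C}(x)$    8, Lemma 2.11, biconditional elimination
10. $\mathscr{D}$    1, 9, MP
11. $\neg\mathscr{D}$    6, conjunction elimination
12. $\mathscr{D} \wedge \neg\mathscr{D}$    10, 11, conjunction introduction
13. $(\forall x)\mathscr{C}(x) \Rightarrow \mathscr{D},\ \neg(\exists y)(\mathscr{C}(y) \Rightarrow \mathscr{D}) \vdash \mathscr{D} \wedge \neg\mathscr{D}$    1-12
14. $(\forall x)\mathscr{C}(x) \Rightarrow \mathscr{D} \vdash (\exists y)(\mathscr{C}(y) \Rightarrow \mathscr{D})$    1-13, proof by contradiction
15. $\vdash ((\forall x)\mathscr{C}(x) \Rightarrow \mathscr{D}) \Rightarrow (\exists y)(\mathscr{C}(y) \Rightarrow \mathscr{D})$    1-14, Corollary 2.6


The converse is proven in the following manner.


1. $(\exists y)(\mathscr{C}(y) \Rightarrow \mathscr{D})$    Hyp
2. $(\forall x)\mathscr{C}(x)$    Hyp
3. $\mathscr{C}(b) \Rightarrow \mathscr{D}$    1, rule C
4. $\mathscr{C}(b)$    2, rule A4
5. $\mathscr{D}$    3, 4, MP
6. $(\exists y)(\mathscr{C}(y) \Rightarrow \mathscr{D}),\ (\forall x)\mathscr{C}(x) \vdash_C \mathscr{D}$    1-5
7. $(\exists y)(\mathscr{C}(y) \Rightarrow \mathscr{D}),\ (\forall x)\mathscr{C}(x) \vdash \mathscr{D}$    6, Proposition 2.10
8. $\vdash (\exists y)(\mathscr{C}(y) \Rightarrow \mathscr{D}) \Rightarrow ((\forall x)\mathscr{C}(x) \Rightarrow \mathscr{D})$    1-7, Corollary 2.6 twice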


Part (a) follows from the two proofs above by biconditional introduction. 
Parts (b)—(f) are proved easily and left as an exercise. (Part (f) is trivial, and 
(e) follows from Exercise 2.33(a); (c) and (d) follow easily from (b) and (a), 
respectively.) 

Lemma 2.29 allows us to move interior quantifiers to the front of a wf. This 
is the essential process in the proof of the following proposition. 


Proposition 2.30 


There is an effective procedure for transforming any wf $\mathscr{B}$ into a wf $\mathscr{C}$ in
prenex normal form such that $\vdash \mathscr{B} \Leftrightarrow \mathscr{C}$.
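Proof


We describe the procedure by induction on the number k of occurrences of
connectives and quantifiers in $\mathscr{B}$. (By Exercise 2.32(a,b), we may assume that
the quantified variables in the prefix that we shall obtain are distinct.) If k = 0, then
let $\mathscr{C}$ be $\mathscr{B}$ itself. Assume that we can find a corresponding $\mathscr{C}$ for all wfs with
k < n, and assume that $\mathscr{B}$ has n occurrences of connectives and quantifiers.


Case 1. If $\mathscr{B}$ is $\neg\mathscr{D}$, then, by inductive hypothesis, we can construct a wf $\mathscr{E}$
in prenex normal form such that $\vdash \mathscr{D} \Leftrightarrow \mathscr{E}$. Hence, $\vdash \neg\mathscr{D} \Leftrightarrow \neg\mathscr{E}$ by biconditional
negation. Thus, $\vdash \mathscr{B} \Leftrightarrow \neg\mathscr{E}$, and, by applying parts (e) and (f) of Lemma
2.29 and the replacement theorem (Proposition 2.9(b)), we can find a wf $\mathscr{C}$ in
prenex normal form such that $\vdash \neg\mathscr{E} \Leftrightarrow \mathscr{C}$. Hence, $\vdash \mathscr{B} \Leftrightarrow \mathscr{C}$.


Case 2. If $\mathscr{B}$ is $\mathscr{D} \Rightarrow \mathscr{E}$, then, by inductive hypothesis, we can find wfs $\mathscr{D}_1$ and
$\mathscr{E}_1$ in prenex normal form such that $\vdash \mathscr{D} \Leftrightarrow \mathscr{D}_1$ and $\vdash \mathscr{E} \Leftrightarrow \mathscr{E}_1$. Hence, by a suitable
tautology and MP, $\vdash (\mathscr{D} \Rightarrow \mathscr{E}) \Leftrightarrow (\mathscr{D}_1 \Rightarrow \mathscr{E}_1)$, that is, $\vdash \mathscr{B} \Leftrightarrow (\mathscr{D}_1 \Rightarrow \mathscr{E}_1)$. Now,
applying parts (a)-(d) of Lemma 2.29 and the replacement theorem, we can
move the quantifiers in the prefixes of $\mathscr{D}_1$ and $\mathscr{E}_1$ to the front, obtaining a wf $\mathscr{C}$
in prenex normal form such that $\vdash \mathscr{B} \Leftrightarrow \mathscr{C}$.


Case 3. If $\mathscr{B}$ is $(\forall x)\mathscr{D}$, then, by inductive hypothesis, there is a wf $\mathscr{D}_1$ in prenex
normal form such that $\vdash \mathscr{D} \Leftrightarrow \mathscr{D}_1$; hence, $\vdash \mathscr{B} \Leftrightarrow (\forall x)\mathscr{D}_1$ by Gen, Lemma 2.8, and
MP. But $(\forall x)\mathscr{D}_1$ is in prenex normal form.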


Examples 
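1. Let $\mathscr{B}$ be $(\forall x)(A_1^1(x) \Rightarrow (\forall y)(A_2^2(x, y) \Rightarrow \neg(\forall z)(A_3^2(y, z))))$.
By part (e) of Lemma 2.29: $(\forall x)(A_1^1(x) \Rightarrow (\forall y)[A_2^2(x, y) \Rightarrow (\exists z)\neg A_3^2(y, z)])$.
By part (d): $(\forall x)(A_1^1(x) \Rightarrow (\forall y)(\exists u)[A_2^2(x, y) \Rightarrow \neg A_3^2(y, u)])$.
By part (c): $(\forall x)(\forall v)(A_1^1(x) \Rightarrow (\exists u)[A_2^2(x, v) \Rightarrow \neg A_3^2(v, u)])$.
By part (d): $(\forall x)(\forall v)(\exists w)(A_1^1(x) \Rightarrow (A_2^2(x, v) \Rightarrow \neg A_3^2(v, w)))$.
Changing bound variables: $(\forall x)(\forall y)(\exists z)(A_1^1(x) \Rightarrow (A_2^2(x, y) \Rightarrow \neg A_3^2(y, z)))$.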
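2. Let $\mathscr{B}$ be $A_1^2(x, y) \Rightarrow (\exists y)[A_1^1(y) \Rightarrow ([(\exists x)A_1^1(x)] \Rightarrow A_2^1(y))]$.
By part (b): $A_1^2(x, y) \Rightarrow (\exists y)(A_1^1(y) \Rightarrow (\forall u)[A_1^1(u) \Rightarrow A_2^1(y)])$.
By part (c): $A_1^2(x, y) \Rightarrow (\exists y)(\forall v)(A_1^1(y) \Rightarrow [A_1^1(v) \Rightarrow A_2^1(y)])$.
By part (d): $(\exists w)(A_1^2(x, y) \Rightarrow (\forall v)[A_1^1(w) \Rightarrow (A_1^1(v) \Rightarrow A_2^1(w))])$.
By part (c): $(\exists w)(\forall z)(A_1^2(x, y) \Rightarrow [A_1^1(w) \Rightarrow (A_1^1(z) \Rightarrow A_2^1(w))])$.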
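
The procedure in the proof of Proposition 2.30 can be sketched in code. This is a minimal illustration, not the text's own formulation: the class names `Atom`, `Not`, `Imp`, `Q`, the encoding of quantifiers as "A" and "E", and the fresh variable names `v1`, `v2`, ... (assumed not to occur in the input) are all choices made here. Every bound variable is renamed to a fresh one, so the side conditions of Lemma 2.29 hold automatically; antecedent quantifiers flip as in parts (a) and (b), consequent quantifiers are kept as in (c) and (d), and negation flips quantifiers as in (e) and (f).

```python
from dataclasses import dataclass
from itertools import count

@dataclass(frozen=True)
class Atom:
    name: str
    args: tuple

@dataclass(frozen=True)
class Not:
    body: object

@dataclass(frozen=True)
class Imp:
    left: object
    right: object

@dataclass(frozen=True)
class Q:
    kind: str   # "A" for universal, "E" for existential
    var: str
    body: object

DUAL = {"A": "E", "E": "A"}

def rename(f, old, new):
    """Replace free occurrences of variable `old` by the fresh variable `new`."""
    if isinstance(f, Atom):
        return Atom(f.name, tuple(new if a == old else a for a in f.args))
    if isinstance(f, Not):
        return Not(rename(f.body, old, new))
    if isinstance(f, Imp):
        return Imp(rename(f.left, old, new), rename(f.right, old, new))
    if f.var == old:          # `old` is rebound here, so nothing below is free
        return f
    return Q(f.kind, f.var, rename(f.body, old, new))

def prenex(f, fresh=None):
    """Return (prefix, matrix): a list of (kind, var) pairs and a quantifier-free wf."""
    fresh = fresh or (f"v{i}" for i in count(1))
    if isinstance(f, Atom):
        return [], f
    if isinstance(f, Not):    # Lemma 2.29(e),(f): negation flips each quantifier
        prefix, matrix = prenex(f.body, fresh)
        return [(DUAL[k], v) for k, v in prefix], Not(matrix)
    if isinstance(f, Q):      # Case 3: keep the quantifier, with a fresh variable
        v = next(fresh)
        prefix, matrix = prenex(rename(f.body, f.var, v), fresh)
        return [(f.kind, v)] + prefix, matrix
    # Case 2: antecedent quantifiers flip (a),(b); consequent ones do not (c),(d)
    left_prefix, left_matrix = prenex(f.left, fresh)
    right_prefix, right_matrix = prenex(f.right, fresh)
    flipped = [(DUAL[k], v) for k, v in left_prefix]
    return flipped + right_prefix, Imp(left_matrix, right_matrix)

# Example 1 above, with predicate letters written A1, A2, A3.
example1 = Q("A", "x", Imp(Atom("A1", ("x",)),
             Q("A", "y", Imp(Atom("A2", ("x", "y")),
                             Not(Q("A", "z", Atom("A3", ("y", "z"))))))))
prefix, matrix = prenex(example1)
```

On Example 1 the sketch produces a universal, universal, existential prefix, matching the final line of that example up to the names of the bound variables.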


Exercise 
2.84 Find prenex normal forms equivalent to the following wfs. 


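a. $[(\forall x)(A_1^1(x) \Rightarrow A_1^2(x, y))] \Rightarrow ([(\exists y)A_1^1(y)] \Rightarrow (\exists z)A_1^2(y, z))$
b. $(\exists x)A_1^2(x, y) \Rightarrow (A_1^1(x) \Rightarrow \neg(\exists u)A_1^2(x, u))$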
A predicate calculus in which there are no function letters or individual
constants and in which, for any positive integer n, there are infinitely many 
predicate letters with n arguments, will be called a pure predicate calculus. 
For pure predicate calculi we can find a very simple prenex normal form 
theorem. A wf in prenex normal form such that all existential quantifiers 
(if any) precede all universal quantifiers (if any) is said to be in Skolem 
normal form. 


Proposition 2.31 


In a pure predicate calculus, there is an effective procedure assigning to each
wf $\mathscr{B}$ another wf $\mathscr{S}$ in Skolem normal form such that $\vdash \mathscr{B}$ if and only if $\vdash \mathscr{S}$
(or, equivalently, by Gödel's completeness theorem, such that $\mathscr{B}$ is logically
valid if and only if $\mathscr{S}$ is logically valid).


Proof 


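First we may assume that $\mathscr{B}$ is a closed wf, since a wf is provable if and only
if its closure is provable. By Proposition 2.30 we may also assume that $\mathscr{B}$ is in
prenex normal form. Let the rank r of $\mathscr{B}$ be the number of universal quantifiers
in $\mathscr{B}$ that precede existential quantifiers. By induction on the rank, we
shall describe the process for finding Skolem normal forms. Clearly, when
the rank is 0, we already have the Skolem normal form. Let us assume that
we can construct Skolem normal forms when the rank is less than r, and let r
be the rank of $\mathscr{B}$. $\mathscr{B}$ can be written as follows: $(\exists y_1) \ldots (\exists y_n)(\forall u)\mathscr{C}(y_1, \ldots, y_n, u)$,
where $\mathscr{C}(y_1, \ldots, y_n, u)$ has only $y_1, \ldots, y_n, u$ as its free variables. Let $A_j^{n+1}$ be the
first predicate letter of n + 1 arguments that does not occur in $\mathscr{B}$. Construct
the wf


$(\mathscr{B}_1)$  $(\exists y_1) \ldots (\exists y_n)([(\forall u)(\mathscr{C}(y_1, \ldots, y_n, u) \Rightarrow A_j^{n+1}(y_1, \ldots, y_n, u))] \Rightarrow (\forall u)A_j^{n+1}(y_1, \ldots, y_n, u))$


Let us show that $\vdash \mathscr{B}$ if and only if $\vdash \mathscr{B}_1$. Assume $\vdash \mathscr{B}_1$. In the proof of $\mathscr{B}_1$,
replace all occurrences of $A_j^{n+1}(z_1, \ldots, z_n, w)$ by $\mathscr{C}^*(z_1, \ldots, z_n, w)$, where $\mathscr{C}^*$ is
obtained from $\mathscr{C}$ by replacing all bound variables having free occurrences
in the proof by new variables not occurring in the proof. The result is a
proof of


$(\exists y_1) \ldots (\exists y_n)([(\forall u)(\mathscr{C}(y_1, \ldots, y_n, u) \Rightarrow \mathscr{C}^*(y_1, \ldots, y_n, u))] \Rightarrow (\forall u)\mathscr{C}^*(y_1, \ldots, y_n, u))$

($\mathscr{C}^*$ was used instead of $\mathscr{C}$ so that applications of axiom (A4) would remain
applications of the same axiom.) Now, by changing the bound variables back
again, we see that


$\vdash (\exists y_1) \ldots (\exists y_n)[(\forall u)(\mathscr{C}(y_1, \ldots, y_n, u) \Rightarrow \mathscr{C}(y_1, \ldots, y_n, u)) \Rightarrow (\forall u)\mathscr{C}(y_1, \ldots, y_n, u)]$


Since $\vdash (\forall u)(\mathscr{C}(y_1, \ldots, y_n, u) \Rightarrow \mathscr{C}(y_1, \ldots, y_n, u))$, we obtain, by the replacement
theorem, $\vdash (\exists y_1) \ldots (\exists y_n)(\forall u)\mathscr{C}(y_1, \ldots, y_n, u)$, that is, $\vdash \mathscr{B}$. Conversely,
assume that $\vdash \mathscr{B}$. By rule C, we obtain $(\forall u)\mathscr{C}(b_1, \ldots, b_n, u)$. But
$\vdash (\forall u)\mathscr{D} \Rightarrow ((\forall u)(\mathscr{D} \Rightarrow \mathscr{E}) \Rightarrow (\forall u)\mathscr{E})$ (see Exercise 2.27(a)) for any wfs $\mathscr{D}$ and $\mathscr{E}$. Hence,
$\vdash_C (\forall u)(\mathscr{C}(b_1, \ldots, b_n, u) \Rightarrow A_j^{n+1}(b_1, \ldots, b_n, u)) \Rightarrow (\forall u)A_j^{n+1}(b_1, \ldots, b_n, u)$. So,
by rule E4,
$\vdash_C (\exists y_1) \ldots (\exists y_n)([(\forall u)(\mathscr{C}(y_1, \ldots, y_n, u) \Rightarrow A_j^{n+1}(y_1, \ldots, y_n, u))] \Rightarrow (\forall u)A_j^{n+1}(y_1, \ldots, y_n, u))$,
that is, $\vdash_C \mathscr{B}_1$. By Proposition 2.10, $\vdash \mathscr{B}_1$. A prenex normal
form of $\mathscr{B}_1$ has the form $\mathscr{B}_2$: $(\exists y_1) \ldots (\exists y_n)(\exists u)(Q_1 z_1) \ldots (Q_s z_s)(\forall v)\mathscr{G}$, where
$\mathscr{G}$ has no quantifiers and $(Q_1 z_1) \ldots (Q_s z_s)$ is the prefix of $\mathscr{C}$. [In deriving the
prenex normal form, first, by Lemma 2.29(a), we pull out the first $(\forall u)$, which
changes to $(\exists u)$; then we pull out of the first conditional the quantifiers in
the prefix of $\mathscr{C}$. By Lemma 2.29(a,b), this exchanges existential and universal
quantifiers, but then we again pull these out of the second conditional of $\mathscr{B}_1$,
which brings the prefix back to its original form. Finally, by Lemma 2.29(c),
we bring the second $(\forall u)$ out to the prefix, changing it to a new quantifier
$(\forall v)$.] Clearly, $\mathscr{B}_2$ has rank one less than the rank of $\mathscr{B}$ and, by Proposition
2.30, $\vdash \mathscr{B}_1 \Leftrightarrow \mathscr{B}_2$. But $\vdash \mathscr{B}$ if and only if $\vdash \mathscr{B}_1$. Hence, $\vdash \mathscr{B}$ if and only if $\vdash \mathscr{B}_2$. By
inductive hypothesis, we can find a Skolem normal form for $\mathscr{B}_2$, which is also
a Skolem normal form for $\mathscr{B}$.
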


Example 


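$\mathscr{B}$: $(\forall x)(\forall y)(\exists z)\mathscr{C}(x, y, z)$, where $\mathscr{C}$ contains no quantifiers
$\mathscr{B}_1$: $(\forall x)((\forall y)(\exists z)\mathscr{C}(x, y, z) \Rightarrow A_j^1(x)) \Rightarrow (\forall x)A_j^1(x)$, where $A_j^1$ is not in $\mathscr{C}$.


We obtain the prenex normal form of $\mathscr{B}_1$:


$(\exists x)([(\forall y)(\exists z)\mathscr{C}(x, y, z) \Rightarrow A_j^1(x)] \Rightarrow (\forall x)A_j^1(x))$    2.29(a)
$(\exists x)([(\exists y)[(\exists z)\mathscr{C}(x, y, z) \Rightarrow A_j^1(x)]] \Rightarrow (\forall x)A_j^1(x))$    2.29(a)
$(\exists x)([(\exists y)(\forall z)[\mathscr{C}(x, y, z) \Rightarrow A_j^1(x)]] \Rightarrow (\forall x)A_j^1(x))$    2.29(b)
$(\exists x)(\forall y)[(\forall z)(\mathscr{C}(x, y, z) \Rightarrow A_j^1(x)) \Rightarrow (\forall x)A_j^1(x)]$    2.29(b)
$(\exists x)(\forall y)(\exists z)([\mathscr{C}(x, y, z) \Rightarrow A_j^1(x)] \Rightarrow (\forall x)A_j^1(x))$    2.29(a)
$(\exists x)(\forall y)(\exists z)(\forall v)[(\mathscr{C}(x, y, z) \Rightarrow A_j^1(x)) \Rightarrow A_j^1(v)]$    2.29(c)


We repeat this process again: Let $\mathscr{D}(x, y, z, v)$ be $(\mathscr{C}(x, y, z) \Rightarrow A_j^1(x)) \Rightarrow A_j^1(v)$.
Let $A_k^2$ not occur in $\mathscr{D}$. Form:


$(\exists x)([(\forall y)[(\exists z)(\forall v)(\mathscr{D}(x, y, z, v)) \Rightarrow A_k^2(x, y)]] \Rightarrow (\forall y)A_k^2(x, y))$
$(\exists x)(\exists y)[[(\exists z)(\forall v)(\mathscr{D}(x, y, z, v)) \Rightarrow A_k^2(x, y)] \Rightarrow (\forall y)A_k^2(x, y)]$    2.29(a)
$(\exists x)(\exists y)(\exists z)(\forall v)([\mathscr{D}(x, y, z, v) \Rightarrow A_k^2(x, y)] \Rightarrow (\forall y)A_k^2(x, y))$    2.29(a,b)
$(\exists x)(\exists y)(\exists z)(\forall v)(\forall w)([\mathscr{D}(x, y, z, v) \Rightarrow A_k^2(x, y)] \Rightarrow A_k^2(x, w))$    2.29(c)

Thus, a Skolem normal form of $\mathscr{B}$ is:


$(\exists x)(\exists y)(\exists z)(\forall v)(\forall w)([((\mathscr{C}(x, y, z) \Rightarrow A_j^1(x)) \Rightarrow A_j^1(v)) \Rightarrow A_k^2(x, y)] \Rightarrow A_k^2(x, w))$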
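
The rank that drives the induction in Proposition 2.31 is easy to compute from a prefix. The sketch below is only an illustration: encoding a prefix as a string of "A" (universal) and "E" (existential) characters and the names `rank` and `is_skolem_prefix` are assumptions made here.

```python
def rank(prefix: str) -> int:
    """Number of universal quantifiers that precede some existential quantifier."""
    last_existential = prefix.rfind("E")
    if last_existential < 0:
        return 0                      # no existential quantifier at all
    return prefix[:last_existential].count("A")

def is_skolem_prefix(prefix: str) -> bool:
    """A prenex wf is in Skolem normal form exactly when its rank is 0."""
    return rank(prefix) == 0

# Prefixes appearing in the example above.
start = rank("AAE")        # (forall x)(forall y)(exists z)
after_one = rank("EAEA")   # (exists x)(forall y)(exists z)(forall v)
after_two = rank("EEEAA")  # (exists x)(exists y)(exists z)(forall v)(forall w)
```

Each round of the construction lowers the rank by one, so the example needs two rounds to go from rank 2 to rank 0.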


Exercises 


2.85 Find Skolem normal forms for the following wfs. 

a. $\neg(\exists x)A_1^1(x) \Rightarrow (\forall u)(\exists y)(\forall x)A_1^3(u, x, y)$
b. $(\forall x)(\exists y)(\forall u)(\exists v)A_1^4(x, y, u, v)$

2.86 Show that there is an effective procedure that gives, for each wf $\mathscr{B}$
of a pure predicate calculus, another wf $\mathscr{D}$ of this calculus of the form
$(\forall y_1) \ldots (\forall y_n)(\exists z_1) \ldots (\exists z_m)\mathscr{C}$, such that $\mathscr{C}$ is quantifier-free, $n, m \geq 0$, and
$\mathscr{B}$ is satisfiable if and only if $\mathscr{D}$ is satisfiable. [Hint: Apply Proposition
2.31 to $\neg\mathscr{B}$.]

2.87 Find a Skolem normal form $\mathscr{S}$ for $(\forall x)(\exists y)A_1^2(x, y)$ and show that it
is not the case that $\vdash \mathscr{S} \Leftrightarrow (\forall x)(\exists y)A_1^2(x, y)$. Hence, a Skolem normal
form for a wf $\mathscr{B}$ is not necessarily logically equivalent to $\mathscr{B}$, in contradistinction
to the prenex normal form given by Proposition 2.30.
2.11 Isomorphism of Interpretations: Categoricity of Theories


We shall say that an interpretation M of some language $\mathscr{L}$ is isomorphic with
an interpretation M* of $\mathscr{L}$ if and only if there is a one-one correspondence g
(called an isomorphism) of the domain D of M with the domain D* of M*
such that:


1. For any predicate letter $A_j^n$ of $\mathscr{L}$ and for any $b_1, \ldots, b_n$ in D, $\vDash_M A_j^n[b_1, \ldots, b_n]$
if and only if $\vDash_{M^*} A_j^n[g(b_1), \ldots, g(b_n)]$.
2. For any function letter $f_j^n$ of $\mathscr{L}$ and for any $b_1, \ldots, b_n$ in D,
$g((f_j^n)^M(b_1, \ldots, b_n)) = (f_j^n)^{M^*}(g(b_1), \ldots, g(b_n))$.
3. For any individual constant $a_j$ of $\mathscr{L}$, $g((a_j)^M) = (a_j)^{M^*}$.


The notation M ~ M* will be used to indicate that M is isomorphic with M*. 
Notice that, if M ~ M* then the domains of M and M* must be of the same 
cardinality. 


Proposition 2.32 


If g is an isomorphism of M with M*, then: 


a. for any wf $\mathscr{B}$ of $\mathscr{L}$, any sequence $s = (b_1, b_2, \ldots)$ of elements of the
domain D of M, and the corresponding sequence $g(s) = (g(b_1),
g(b_2), \ldots)$, s satisfies $\mathscr{B}$ in M if and only if g(s) satisfies $\mathscr{B}$ in M*;

b. hence, $\vDash_M \mathscr{B}$ if and only if $\vDash_{M^*} \mathscr{B}$.


Proof 


Part (b) follows directly from part (a). The proof of part (a) is by induction
on the number of connectives and quantifiers in $\mathscr{B}$ and is left as an exercise.

From the definition of isomorphic interpretations and Proposition 2.32 we 
see that isomorphic interpretations have the same “structure” and, thus, dif- 
fer in no essential way. 
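
For finite interpretations the definition can be tested directly by trying every one-one correspondence. The following sketch is an illustration under simplifying assumptions made here: the language has a single predicate letter, interpreted as a set of tuples, and no function letters or individual constants. The function names are hypothetical.

```python
from itertools import permutations, product

def is_isomorphism(g, D, rel, rel_star, arity):
    """Clause 1 of the definition: rel holds of a tuple iff rel_star holds of its image."""
    return all(
        (t in rel) == (tuple(g[b] for b in t) in rel_star)
        for t in product(D, repeat=arity)
    )

def find_isomorphism(D, rel, D_star, rel_star, arity):
    """Return some isomorphism as a dict, or None if the interpretations are not isomorphic."""
    D, D_star = list(D), list(D_star)
    if len(D) != len(D_star):         # domains must have the same cardinality
        return None
    for image in permutations(D_star):
        g = dict(zip(D, image))
        if is_isomorphism(g, D, rel, rel_star, arity):
            return g
    return None

# M: {0, 1, 2} with the usual strict order.
M_rel = {(0, 1), (0, 2), (1, 2)}
# M*: {"a", "b", "c"} ordered c < b < a.
M_star_rel = {("c", "b"), ("c", "a"), ("b", "a")}
# A third interpretation on {0, 1, 2}: a 3-cycle, which is not an ordering.
cycle_rel = {(0, 1), (1, 2), (2, 0)}

g = find_isomorphism([0, 1, 2], M_rel, ["a", "b", "c"], M_star_rel, 2)
no_g = find_isomorphism([0, 1, 2], M_rel, [0, 1, 2], cycle_rel, 2)
```

The search is exhaustive, so it is practical only for very small domains; it is meant to make the definition concrete rather than to be efficient.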


Exercises 


2.88 Prove that, if M is an interpretation with domain D and D* is a set that
has the same cardinality as D, then one can define an interpretation M*
with domain D* such that M is isomorphic with M*.

2.89 Prove the following: (a) M is isomorphic with M. (b) If $M_1$ is isomorphic
with $M_2$, then $M_2$ is isomorphic with $M_1$. (c) If $M_1$ is isomorphic with $M_2$
and $M_2$ is isomorphic with $M_3$, then $M_1$ is isomorphic with $M_3$.
A theory with equality K is said to be m-categorical, where m is a cardinal
number, if and only if: any two normal models of K of cardinality m
are isomorphic, and K has at least one normal model of cardinality m (see
Łoś, 1954c).


Examples 


1. Let $K^2$ be the pure theory of equality $K_1$ (see page 96) to which has
been added axiom $(E_2)$: $(\exists x_1)(\exists x_2)(x_1 \neq x_2 \wedge (\forall x_3)(x_3 = x_1 \vee x_3 = x_2))$.
Then $K^2$ is 2-categorical. Every normal model of $K^2$ has exactly two
elements. More generally, define $(E_n)$ to be:


$(\exists x_1) \ldots (\exists x_n)\left(\bigwedge_{1 \leq i < j \leq n} x_i \neq x_j \wedge (\forall y)(y = x_1 \vee \ldots \vee y = x_n)\right)$


where $\bigwedge_{1 \leq i < j \leq n} x_i \neq x_j$ is the conjunction of all wfs $x_i \neq x_j$ with $1 \leq i < j
\leq n$. Then, if $K^n$ is obtained from $K_1$ by adding $(E_n)$ as an axiom, $K^n$ is
n-categorical, and every normal model of $K^n$ has exactly n elements.


2. The theory $K_2$ (see page 96) of densely ordered sets with neither first
nor last element is $\aleph_0$-categorical (see Kamke, 1950, p. 71: every denumerable
normal model of $K_2$ is isomorphic with the model consisting
of the set of rational numbers under their natural ordering). But one
can prove that $K_2$ is not m-categorical for any m different from $\aleph_0$.
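
The claim that $(E_n)$ holds exactly in normal models with n elements can be checked by brute force for small domains. This is only an illustrative sketch: the function name `holds_E` and the use of `range(k)` as a k-element domain, with = interpreted as identity, are assumptions made here.

```python
from itertools import product

def holds_E(n: int, k: int) -> bool:
    """Evaluate (E_n) in a normal model whose domain has k >= 1 elements."""
    domain = range(k)
    for xs in product(domain, repeat=n):            # candidate values for x_1..x_n
        pairwise_distinct = len(set(xs)) == n       # all x_i != x_j for i < j
        covers_domain = all(y in xs for y in domain)  # every y equals some x_i
        if pairwise_distinct and covers_domain:
            return True
    return False

# Truth table for small n and k: row n, column k.
table = {(n, k): holds_E(n, k) for n in range(1, 5) for k in range(1, 6)}
```

Only the diagonal entries of `table` are true, which is the finite counterpart of the statement that every normal model of $K^n$ has exactly n elements.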


Exercises 


2.90^A Find a theory with equality that is not $\aleph_0$-categorical but is m-categorical
for all $m > \aleph_0$. [Hint: Consider the theory $G_C$ of abelian groups
(see page 96). For each integer n, let ny stand for the term $(y + y) + \cdots + y$
consisting of the sum of n ys. Add to $G_C$ the axioms $(\mathscr{B}_n)$: $(\forall x)(\exists_1 y)(ny = x)$
for all $n \geq 2$. The new theory is the theory of uniquely divisible abelian
groups. Its normal models are essentially vector spaces over the field
of rational numbers. However, any two vector spaces over the rational
numbers of the same nondenumerable cardinality are isomorphic, and
there are denumerable vector spaces over the rational numbers that are
not isomorphic (see Bourbaki, 1947).]

2.91^A Find a theory with equality that is m-categorical for all infinite cardinals
m. [Hint: Add to the theory $G_C$ of abelian groups the axiom $(\forall x_1)
(2x_1 = 0)$. The normal models of this theory are just the vector spaces
over the field of integers modulo 2. Any two such vector spaces of the
same cardinality are isomorphic (see Bourbaki, 1947).]

2.92 Show that the theorems of the theory $K^n$ in Example 1 above are precisely
the set of all wfs of $K^n$ that are true in all normal models of
cardinality n.

2.93^A Find two nonisomorphic densely ordered sets of cardinality $2^{\aleph_0}$
with neither first nor last element. (This shows that the theory $K_2$ of
Example 2 is not $2^{\aleph_0}$-categorical.)


Is there a theory with equality that is m-categorical for some noncountable
cardinal m but not n-categorical for some other noncountable cardinal
n? In Example 2 we found a theory that is only $\aleph_0$-categorical; in Exercise
2.90 we found a theory that is m-categorical for all infinite $m > \aleph_0$ but not
$\aleph_0$-categorical, and in Exercise 2.91, a theory that is m-categorical for any
infinite m. The elementary theory G of groups is not m-categorical for
any infinite m. The problem is whether these four cases exhaust all the
possibilities. That this is so was proved by Morley (1965).


2.12 Generalized First-Order Theories: 
Completeness and Decidability* 


If, in the definition of the notion of first-order language, we allow a non- 
countable number of predicate letters, function letters, and individual con- 
stants, we arrive at the notion of a generalized first-order language. The notions 
of interpretation and model extend in an obvious way to a generalized first- 
order language. A generalized first-order theory in such a language is obtained 
by taking as proper axioms any set of wfs of the language. Ordinary first- 
order theories are special cases of generalized first-order theories. The reader 
may easily check that all the results for first-order theories, through Lemma 
2.12, hold also for generalized first-order theories without any changes in 
the proofs. Lemma 2.13 becomes Lemma 2.13’: if the set of symbols of a gen- 
eralized theory K has cardinality Na, then the set of expressions of K also 
can be well-ordered and has cardinality Xa. (First, fix a well-ordering of the 
symbols of K. Second, order the expressions by their length, which is some 
positive integer, and then stipulate that if e, and e, are two distinct expres- 
sions of the same length k, and j is the first place in which they differ, then e, 
precedes e, if the jth symbol of e, precedes the jth symbol of e, according to the 
given well-ordering of the symbols of K.) Now, under the same assumption 
as for Lemma 2.13’, Lindenbaum’s Lemma 2.14’ can be proved for generalized 
theories much as before, except that all the enumerations (of the wfs .4 and of 
the theories J;) are transfinite, and the proof that J is consistent and complete 
uses transfinite induction. The analogue of Henkin’s Proposition 2.17 runs 
as follows. 


* Presupposed in parts of this section is a slender acquaintance with ordinal and cardinal 
numbers (see Chapter 4; or Kamke, 1950; or Sierpinski, 1958). 
Proposition 2.33


If the set of symbols of a consistent generalized theory K has cardinality $\aleph_\alpha$,
then K has a model of cardinality $\aleph_\alpha$.


Proof 


The original proof of Lemma 2.15 is modified in the following way. Add
$\aleph_\alpha$ new individual constants $b_1, b_2, \ldots, b_\lambda, \ldots$. As before, the new theory
$K_0$ is consistent. Let $F_1(x_{i_1}), \ldots, F_\lambda(x_{i_\lambda}), \ldots$ $(\lambda < \omega_\alpha)$ be a sequence consisting
of all wfs of $K_0$ with exactly one free variable. Let $(S_\lambda)$ be the sentence
$(\exists x_{i_\lambda})\neg F_\lambda(x_{i_\lambda}) \Rightarrow \neg F_\lambda(b_{j_\lambda})$, where the sequence $b_{j_1}, b_{j_2}, \ldots, b_{j_\lambda}, \ldots$ of distinct
individual constants is chosen so that $b_{j_\lambda}$ does not occur in $F_\beta(x_{i_\beta})$ for $\beta \leq \lambda$. The
new theory $K_\infty$, obtained by adding all the wfs $(S_\lambda)$ as axioms, is proved to
be consistent by a transfinite induction analogous to the inductive proof
in Lemma 2.15. $K_\infty$ is a scapegoat theory that is an extension of K and contains
$\aleph_\alpha$ closed terms. By the extended Lindenbaum Lemma 2.14', $K_\infty$ can
be extended to a consistent, complete scapegoat theory J with $\aleph_\alpha$ closed
terms. The same proof as in Lemma 2.16 provides a model M of J of cardinality
$\aleph_\alpha$.
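
The ordering of expressions described for Lemma 2.13' (first by length, then by the first place at which two expressions of equal length differ) can be illustrated for a finite alphabet. The sketch below is an assumption-laden toy: the five-symbol alphabet, the representation of expressions as strings of single-character symbols, and the name `expression_key` are all chosen here for illustration. The transfinite case in the text replaces the finite ranking by a well-ordering of the symbols.

```python
def expression_key(expr, symbol_rank):
    """Sort key: compare lengths first, then symbols position by position."""
    return (len(expr), [symbol_rank[s] for s in expr])

# A fixed ordering of the symbols (hypothetical alphabet).
alphabet = ["(", ")", "x", "=", "~"]
symbol_rank = {symbol: position for position, symbol in enumerate(alphabet)}

expressions = ["x=x", "~", "()", "x", "=x"]
ordered = sorted(expressions, key=lambda e: expression_key(e, symbol_rank))
```

Python compares the key tuples left to right and compares the inner lists lexicographically, which is exactly the "first place in which they differ" rule for expressions of the same length.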


Corollary 2.34 


a. If the set of symbols of a consistent generalized theory with equality
K has cardinality $\aleph_\alpha$, then K has a normal model of cardinality less
than or equal to $\aleph_\alpha$.

b. If, in addition, K has an infinite normal model (or if K has arbitrarily
large finite normal models), then K has a normal model of any cardinality
$\aleph_\beta \geq \aleph_\alpha$.

c. In particular, if K is an ordinary theory with equality (i.e., $\aleph_\alpha = \aleph_0$)
and K has an infinite normal model (or if K has arbitrarily large
finite normal models), then K has a normal model of any cardinality
$\aleph_\beta$ $(\beta \geq 0)$.



Proof 


a. The model guaranteed by Proposition 2.33 can be contracted to a
normal model consisting of equivalence classes in a set of cardinality
$\aleph_\alpha$. Such a set of equivalence classes has cardinality less than or
equal to $\aleph_\alpha$.

b. Assume $\aleph_\beta \geq \aleph_\alpha$. Let $b_1, b_2, \ldots$ be a set of new individual constants of
cardinality $\aleph_\beta$ and add the axioms $b_\lambda \neq b_\mu$ for $\lambda \neq \mu$. As in the proof
of Corollary 2.27, this new theory is consistent and so, by (a), has a
normal model of cardinality less than or equal to $\aleph_\beta$ (since the new
theory has $\aleph_\beta$ new symbols). But, because of the axioms $b_\lambda \neq b_\mu$, the
normal model has exactly $\aleph_\beta$ elements.

c. This is a special case of (b). 


Exercise 


2.94 If the set of symbols of a predicate calculus with equality K has
cardinality $\aleph_\alpha$, prove that there is an extension K' of K (with the same
symbols as K) such that K' has a normal model of cardinality $\aleph_\alpha$, but K'
has no normal model of cardinality less than $\aleph_\alpha$.


From Lemma 2.12 and Corollary 2.34(a,b), it follows easily that, if a generalized
theory with equality K has $\aleph_\alpha$ symbols, is $\aleph_\beta$-categorical for some
$\beta \geq \alpha$, and has no finite models, then K is complete, in the sense that, for any
closed wf $\mathscr{B}$, either $\vdash_K \mathscr{B}$ or $\vdash_K \neg\mathscr{B}$ (Vaught, 1954). If not-$\vdash_K \mathscr{B}$ and not-$\vdash_K \neg\mathscr{B}$,
then the theories K' = K + {$\neg\mathscr{B}$} and K'' = K + {$\mathscr{B}$} are consistent by Lemma
2.12, and so, by Corollary 2.34(a), there are normal models M' and M'' of K'
and K'', respectively, of cardinality less than or equal to $\aleph_\alpha$. Since K has no
finite models, M' and M'' are infinite. Hence, by Corollary 2.34(b), there are
normal models N' and N'' of K' and K'', respectively, of cardinality $\aleph_\beta$. By the
$\aleph_\beta$-categoricity of K, N' and N'' must be isomorphic. But, since $\neg\mathscr{B}$ is true in
N' and $\mathscr{B}$ is true in N'', this is impossible by Proposition 2.32(b). Therefore,
either $\vdash_K \mathscr{B}$ or $\vdash_K \neg\mathscr{B}$.

In particular, if K is an ordinary theory with equality that has no
finite models and is $\aleph_\beta$-categorical for some $\beta \geq 0$, then K is complete.
As an example, consider the theory $K_2$ of densely ordered sets with neither
first nor last element (see page 96). $K_2$ has no finite models and is
$\aleph_0$-categorical.

If an ordinary theory K is axiomatic (i.e., one can effectively decide whether 
any wf is an axiom) and complete, then K is decidable, that is, there is an 
effective procedure to determine whether any given wf is a theorem. To see 
this, remember (see page 84) that if a theory is axiomatic, one can effectively 
enumerate the theorems. Any wf ℬ is provable if and only if its closure is
provable. Hence, we may confine our attention to closed wfs ℬ. Since K is
complete, either ℬ is a theorem or ¬ℬ is a theorem, and, therefore, one or the
other will eventually turn up in our enumeration of theorems. This provides
an effective test for theoremhood. Notice that, if K is inconsistent, then every
wf is a theorem and there is an obvious decision procedure; if K is consistent,
then not both ℬ and ¬ℬ can show up as theorems and we need only wait until
one or the other appears.
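
The waiting procedure just described can be sketched in a few lines of
Python. This is only an illustration under stated assumptions: the names
decide, toy_theorems, and toy_negate are hypothetical, and the toy enumeration
stands in for a genuine enumeration of the theorems of an axiomatic theory.
The loop is guaranteed to stop only when the theory is complete.

```python
from itertools import count


def decide(sentence, enumerate_theorems, negate):
    """Scan the enumeration until `sentence` or its negation appears.

    Terminates only if one of the two is a theorem, which is exactly what
    completeness of the theory guarantees for closed wfs.
    """
    negation = negate(sentence)
    for theorem in enumerate_theorems():
        if theorem == sentence:
            return True
        if theorem == negation:
            return False


# Toy stand-in (an assumption for illustration): the "theorems" are the true
# statements about evenness of 0, 1, 2, ..., listed one at a time.
def toy_theorems():
    for n in count():
        yield f"even({n})" if n % 2 == 0 else f"not even({n})"


def toy_negate(s):
    """Negate a toy sentence by adding or stripping a leading 'not '."""
    return s[4:] if s.startswith("not ") else "not " + s
```

For instance, decide("even(7)", toy_theorems, toy_negate) scans the list until
"not even(7)" appears and then answers that "even(7)" is not a theorem.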

If an ordinary axiomatic theory with equality K has no finite models and is
ℵ_β-categorical for some β ≥ 0, then, by what we have proved, K is decidable.
In particular, the theory K_2 discussed above is decidable.

In certain cases, there is a more direct method of proving completeness
or decidability. Let us take as an example the theory K_2 of densely ordered
sets with neither first nor last element. Langford (1927) has given the
following procedure for K_2. Consider any closed wf ℬ. By Proposition 2.30,
we can assume that ℬ is in prenex normal form (Q_1y_1) ... (Q_ny_n)𝒞, where
𝒞 contains no quantifiers. If (Q_ny_n) is (∀y_n), replace (∀y_n)𝒞 by
¬(∃y_n)¬𝒞. In all cases, then, we have, at the right side of the wf,
(∃y_n)𝒟, where 𝒟 has no quantifiers. Any negation x ≠ y can be replaced by
x < y ∨ y < x, and ¬(x < y) can be replaced by x = y ∨ y < x. Hence, all
negation signs can be eliminated from 𝒟. We can now put 𝒟 into disjunctive
normal form, that is, a disjunction of conjunctions of atomic wfs (see
Exercise 1.42). Now (∃y_n)(𝒟_1 ∨ 𝒟_2 ∨ ... ∨ 𝒟_k) is equivalent to
(∃y_n)𝒟_1 ∨ (∃y_n)𝒟_2 ∨ ... ∨ (∃y_n)𝒟_k. Consider each (∃y_n)𝒟_i
separately. 𝒟_i is a conjunction of atomic wfs of the form t < s and t = s.
If 𝒟_i does not contain y_n, just erase (∃y_n). Note that, if a wf ℰ does not
contain y_n, then (∃y_n)(ℰ ∧ ℱ) may be replaced by ℰ ∧ (∃y_n)ℱ. Hence, we are
reduced to the consideration of (∃y_n)ℱ, where ℱ is a conjunction of atomic
wfs of the form t < s or t = s, each of which contains y_n. Now, if one of
the conjuncts is y_n = z for some z different from y_n, then replace in ℱ all
occurrences of y_n by z and erase (∃y_n). If we have y_n = y_n alone, then
just erase (∃y_n). If we have y_n = y_n as one conjunct among others, then
erase y_n = y_n. If ℱ has a conjunct y_n < y_n, then replace all of (∃y_n)ℱ
by y_n < y_n. If ℱ consists of y_n < z_1 ∧ ... ∧ y_n < z_j ∧ u_1 < y_n ∧ ...
∧ u_m < y_n, then replace (∃y_n)ℱ by the conjunction of all the wfs u_i < z_p
for 1 ≤ i ≤ m and 1 ≤ p ≤ j. If all the u_i or all the z_p are missing,
replace (∃y_n)ℱ by y_n = y_n. This exhausts all possibilities and, in every
case, we have replaced (∃y_n)ℱ by a wf containing no quantifiers, that is, we
have eliminated the quantifier (∃y_n). We are left with
(Q_1y_1) ... (Q_{n-1}y_{n-1})𝒢, where 𝒢 contains no quantifiers. Now we apply
the same procedure successively to (Q_{n-1}y_{n-1}), ..., (Q_1y_1). Finally we
are left with a wf without quantifiers, built up of wfs of the form x = x and
x < x. If we replace x = x by x = x ⇒ x = x and x < x by ¬(x = x ⇒ x = x),
the result is either an instance of a tautology or the negation of such an
instance. Hence, by Proposition 2.1, either the result or its negation is
provable. Now, one can easily check that all the replacements we have made in
this whole reduction procedure applied to ℬ have been replacements of wfs ℋ
by other wfs 𝒰 such that ⊢_K ℋ ⇔ 𝒰. Hence, by the replacement theorem, if our
final result is provable, then so is the original wf ℬ, and, if the negation
of the final result is provable, then so is ¬ℬ. Thus, K_2 is complete and
decidable.
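
The central step of Langford's procedure, the removal of one existential
quantifier in front of a conjunction of atomic wfs, can be sketched in Python
as follows. The representation is an assumption made for illustration: atoms
are triples ("<", s, t) or ("=", s, t) with variable names as strings, a list
of atoms is read as their conjunction, and the function name eliminate_exists
is hypothetical. Where the text writes y_n = y_n for a trivially true result,
the sketch simply returns the remaining conjuncts (an empty list when nothing
remains).

```python
def eliminate_exists(y, conj):
    """Return a quantifier-free conjunction equivalent, in densely ordered
    sets without first or last element, to (Ey)(conj)."""
    # Conjuncts that do not contain y can be moved outside the quantifier.
    outside = [atom for atom in conj if y not in atom[1:]]
    inside = [atom for atom in conj if y in atom[1:]]
    if not inside:
        return outside

    # A conjunct y = z with z different from y: substitute z for y.
    for atom in inside:
        op, s, t = atom
        if op == "=" and s != t:
            z = t if s == y else s
            rest = [other for other in inside if other != atom]
            return outside + [
                (o, z if a == y else a, z if b == y else b) for o, a, b in rest
            ]

    # Erase conjuncts y = y; a conjunct y < y makes the whole wf false.
    inside = [atom for atom in inside if atom != ("=", y, y)]
    if ("<", y, y) in inside:
        return [("<", y, y)]

    # What remains are lower bounds u < y and upper bounds y < z.
    lowers = [s for op, s, t in inside if t == y]
    uppers = [t for op, s, t in inside if s == y]
    # Density and the absence of endpoints give: some y exists if and only
    # if every lower bound lies below every upper bound.
    return outside + [("<", u, z) for u in lowers for z in uppers]
```

For example, applied to the conjunction a < y ∧ y < b with y as the
quantified variable, the sketch returns the single conjunct a < b.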

The method used in this proof, the successive elimination of existential
quantifiers, has been applied to other theories. It yields a decision procedure
(see Hilbert and Bernays, 1934, §5) for the pure theory of equality K_1 (see
page 96). It has been applied by Tarski (1951) to prove the completeness and
decidability of elementary algebra (i.e., of the theory of real-closed fields; see
van der Waerden, 1949) and by Szmielew (1955) to prove the decidability of
the theory G_C of abelian groups.

Exercises


2.95 (Henkin, 1955) If an ordinary theory with equality K is finitely
axiomatizable and ℵ_α-categorical for some α, prove that K is decidable.


2.96 a. Prove the decidability of the pure theory K_1 of equality.


b. Give an example of a theory with equality that is ℵ_α-categorical for
some α, but is incomplete.


2.12.1 Mathematical Applications 


1. Let F be the elementary theory of fields (see page 96). We let n̄
stand for the term 1 + 1 + ... + 1, consisting of the sum of n 1s. Then
the assertion that a field has characteristic p can be expressed by
the wf 𝒞_p: p̄ = 0. A field has characteristic 0 if and only if it does
not have characteristic p for any prime p. Then for any closed wf
ℬ of F that is true for all fields of characteristic 0, there is a prime
number q such that ℬ is true for all fields of characteristic greater
than or equal to q. To see this, notice that, if F_0 is obtained from
F by adding as axioms ¬𝒞_2, ¬𝒞_3, ..., ¬𝒞_p, ... (for all primes p), the
normal models of F_0 are the fields of characteristic 0. Hence, by
Exercise 2.77, ⊢_{F_0} ℬ. But then, for some finite set of new axioms
¬𝒞_{q_1}, ¬𝒞_{q_2}, ..., ¬𝒞_{q_n}, we have
¬𝒞_{q_1}, ¬𝒞_{q_2}, ..., ¬𝒞_{q_n} ⊢_F ℬ. Let q be a prime
greater than all q_1, ..., q_n. In every field of characteristic greater
than or equal to q, the wfs ¬𝒞_{q_1}, ¬𝒞_{q_2}, ..., ¬𝒞_{q_n} are true;
hence, ℬ is also true. (Other applications in algebra may be found in
A. Robinson (1951) and Cherlin (1976).)


2. A graph may be considered as a set with a symmetric binary relation
R (i.e., the relation that holds between two vertices if and
only if they are connected by an edge). Call a graph k-colorable
if and only if the graph can be divided into k disjoint (possibly
empty) sets such that no two elements in the same set are in the
relation R. (Intuitively, these sets correspond to k colors, each color
being painted on the points in the corresponding set, with the proviso
that two points connected by an edge are painted different
colors.) Notice that any subgraph of a k-colorable graph is k-colorable.
Now we can show that, if every finite subgraph of a graph 𝒢
is k-colorable, and if 𝒢 can be well-ordered, then the whole graph
𝒢 is k-colorable. To prove this, construct the following generalized
theory with equality K (Beth, 1953). There are two binary predicate
letters, A_1^2 (=) and A_2^2 (corresponding to the relation R on 𝒢);
there are k monadic predicate letters A_1^1, ..., A_k^1 (corresponding to
the k subsets into which we hope to divide the graph); and there
are individual constants a_c, one for each element c of the graph 𝒢.
As proper axioms, in addition to the usual assumptions (A6) and
(A7), we have the following wfs:


I. ¬A_2^2(x, x) (irreflexivity of R)
II. A_2^2(x, y) ⇒ A_2^2(y, x) (symmetry of R)
III. (∀x)(A_1^1(x) ∨ A_2^1(x) ∨ ... ∨ A_k^1(x)) (division into k classes)
IV. (∀x)¬(A_i^1(x) ∧ A_j^1(x)) for 1 ≤ i < j ≤ k (disjointness of the k classes)
V. (∀x)(∀y)(A_i^1(x) ∧ A_i^1(y) ⇒ ¬A_2^2(x, y)) for 1 ≤ i ≤ k (two elements of
the same class are not in the relation R)
VI. a_b ≠ a_c for any two distinct elements b and c of 𝒢
VII. A_2^2(a_b, a_c) if R(b, c) holds in 𝒢

Now, any finite set of these axioms involves only a finite number of the
individual constants a_{c_1}, ..., a_{c_n}, and since the corresponding subgraph
{c_1, ..., c_n} is, by assumption, k-colorable, the given finite set of axioms
has a model and is, therefore, consistent. Since any finite set of axioms is
consistent, K is consistent. By Corollary 2.34(a), K has a normal model of
cardinality less than or equal to the cardinality of 𝒢. This model is a
k-colorable graph and, by (VI)-(VII), has 𝒢 as a subgraph. Hence 𝒢 is also
k-colorable. (Compare this proof with a standard mathematical proof of the
same result by de Bruijn and Erdős (1951). Generally, use of the method above
replaces complicated applications of Tychonoff’s theorem or König’s
Unendlichkeitslemma.)
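
The hypothesis of this argument concerns finite subgraphs only, and for a
finite graph k-colorability can be checked mechanically. The following Python
sketch does so by backtracking. It is an illustration, not part of the proof:
the function name k_coloring and the edge-list representation are assumptions,
and the graph is assumed to satisfy axiom (I), that is, to have no edge from a
vertex to itself.

```python
def k_coloring(vertices, edges, k):
    """Return a dict vertex -> color in range(k) such that adjacent vertices
    get different colors, or None if the finite graph is not k-colorable.

    Assumes the relation R is irreflexive (no vertex is adjacent to itself).
    """
    neighbors = {v: set() for v in vertices}
    for b, c in edges:
        # R is symmetric, so record the edge in both directions.
        neighbors[b].add(c)
        neighbors[c].add(b)

    order = list(vertices)
    coloring = {}

    def extend(i):
        # Try to color order[i], order[i+1], ... given the colors so far.
        if i == len(order):
            return True
        v = order[i]
        for color in range(k):
            if all(coloring.get(w) != color for w in neighbors[v]):
                coloring[v] = color
                if extend(i + 1):
                    return True
                del coloring[v]
        return False

    return dict(coloring) if extend(0) else None
```

A triangle, for instance, has no 2-coloring but has a 3-coloring, so
k_coloring returns None for k = 2 and a dictionary of colors for k = 3.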


Exercises 


2.97^A (Łoś, 1954b) A group B is said to be orderable if there exists a binary
relation R on B that totally orders B such that, if xRy, then (x + z)
R(y + z) and (z + x)R(z + y). Show, by a method similar to that used
in Example 2 above, that a group B is orderable if and only if every
finitely generated subgroup is orderable (if we assume that the set B
can be well-ordered).


2.98^A Set up a theory for algebraically closed fields of characteristic p (≥ 0) by
adding to the theory F of fields the new axioms P_n, where P_n states that
every nonconstant polynomial of degree n has a root, as well as axioms
that determine the characteristic. Show that every wf of F that holds for
one algebraically closed field of characteristic 0 holds for all of them.
[Hint: This theory is ℵ_β-categorical for β > 0, is axiomatizable, and has
no finite models. See A. Robinson (1952).]

2.99 By ordinary mathematical reasoning, solve the finite marriage problem.
Given a finite set M of m men and a set N of women such that each man
knows only a finite number of women and, for 1 ≤ k ≤ m, any subset
of M having k elements knows at least k women of N (i.e., there are at
least k women in N who know at least one of the k given men), then it is
possible to marry (monogamously) all the men of M to women in N so
that every man is married to a woman whom he knows. [Hint (Halmos
and Vaughan, 1950): m = 1 is trivial. For m > 1, use induction,
considering the cases: (I) for all k with 1 ≤ k < m, every set of k men knows at
least k + 1 women; and (II) for some k with 1 ≤ k < m, there is a set of k
men knowing exactly k women.] Extend this result to the infinite case,
that is, when M is infinite and well-orderable and the assumptions
above hold for all finite k. [Hint: Construct an appropriate generalized
theory with equality, analogous to that in Example 2 above, and use
Corollary 2.34(a).]


2.100 Prove that there is no generalized theory with equality K, having one 
predicate letter < in addition to =, such that the normal models of K are 
exactly those normal interpretations in which the interpretation of < is 
a well-ordering of the domain of the interpretation. 


Let ℬ be a wf in prenex normal form. If ℬ is not closed, form its closure
instead. Suppose, for example, ℬ is
(∃y_1)(∀y_2)(∀y_3)(∃y_4)(∃y_5)(∀y_6)𝒞(y_1, y_2, y_3, y_4, y_5, y_6),
where 𝒞 contains no quantifiers. Erase (∃y_1) and replace y_1 in 𝒞 by
a new individual constant b_1:
(∀y_2)(∀y_3)(∃y_4)(∃y_5)(∀y_6)𝒞(b_1, y_2, y_3, y_4, y_5, y_6).
Erase (∀y_2) and (∀y_3), obtaining
(∃y_4)(∃y_5)(∀y_6)𝒞(b_1, y_2, y_3, y_4, y_5, y_6). Now
erase (∃y_4) and replace y_4 in 𝒞 by g(y_2, y_3), where g is a new function letter:
(∃y_5)(∀y_6)𝒞(b_1, y_2, y_3, g(y_2, y_3), y_5, y_6). Erase (∃y_5) and replace
y_5 by h(y_2, y_3), where h is another new function letter:
(∀y_6)𝒞(b_1, y_2, y_3, g(y_2, y_3), h(y_2, y_3), y_6).
Finally, erase (∀y_6). The resulting wf
𝒞(b_1, y_2, y_3, g(y_2, y_3), h(y_2, y_3), y_6) contains
no quantifiers and will be denoted by ℬ*. Thus, by introducing new
function letters and individual constants, we can eliminate the quantifiers
from a wf.


Examples 


1. If ℬ is (∀y_1)(∃y_2)(∀y_3)(∀y_4)(∃y_5)𝒞(y_1, y_2, y_3, y_4, y_5), where 𝒞 is
quantifier-free, then ℬ* is of the form 𝒞(y_1, g(y_1), y_3, y_4, h(y_1, y_3, y_4)).


2. If ℬ is (∃y_1)(∃y_2)(∀y_3)(∀y_4)(∃y_5)𝒞(y_1, y_2, y_3, y_4, y_5), where 𝒞 is
quantifier-free, then ℬ* is of the form 𝒞(b, c, y_3, y_4, g(y_3, y_4)).


Notice that ℬ* ⊢ ℬ, since we can put the quantifiers back by applications
of Gen and rule E4. (To be more precise, in the process of obtaining ℬ*, we
drop all quantifiers and, for each existentially quantified variable y_i, we
substitute a term g(z_1, ..., z_k), where g is a new function letter and
z_1, ..., z_k are the variables that were universally quantified in the prefix
preceding (∃y_i). If there are no such variables z_1, ..., z_k, we replace y_i
by a new individual constant.)
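
The passage from a quantifier prefix to the terms of ℬ* is purely mechanical,
as the following Python sketch suggests. The encoding is an assumption made
for illustration: a prefix is a list of pairs ("forall", y) or ("exists", y),
the function name skolem_terms is hypothetical, and the supplies of new
function letters and individual constants are passed in as short tuples, so
the sketch only handles prefixes that do not exhaust them.

```python
def skolem_terms(prefix, function_letters=("g", "h", "k"),
                 constants=("b", "c", "d")):
    """For each variable of the prefix, in order, return the term that takes
    its place in the quantifier-free matrix of the starred wf."""
    functions = iter(function_letters)
    consts = iter(constants)
    universals = []  # universally quantified variables seen so far
    terms = []
    for quantifier, var in prefix:
        if quantifier == "forall":
            # Universal variables stay; only their quantifiers are erased.
            universals.append(var)
            terms.append(var)
        elif universals:
            # Existential variable preceded by universal quantifiers:
            # a new function letter applied to those variables.
            terms.append(f"{next(functions)}({', '.join(universals)})")
        else:
            # No preceding universal quantifier: a new individual constant.
            terms.append(next(consts))
    return terms
```

Applied to the prefixes of Examples 1 and 2 above, the sketch produces the
argument lists y_1, g(y_1), y_3, y_4, h(y_1, y_3, y_4) and
b, c, y_3, y_4, g(y_3, y_4), respectively.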

Proposition 2.35 (Second ε-Theorem)


(Rasiowa, 1956; Hilbert and Bernays, 1939) Let K be a generalized theory.
Replace each axiom ℬ of K by ℬ*. (The new function letters and individual
constants introduced for one axiom are to be different from those introduced
for another axiom.) Let K* be the generalized theory with the proper
axioms ℬ*. Then:


a. If 𝒟 is a wf of K and ⊢_{K*} 𝒟, then ⊢_K 𝒟.
b. K is consistent if and only if K* is consistent.


Proof 


a. Let 𝒟 be a wf of K such that ⊢_{K*} 𝒟. Consider the ordinary theory K°
whose axioms ℬ_1, ..., ℬ_n are such that ℬ_1*, ..., ℬ_n* are the axioms used
in the proof of 𝒟. Let K°* be the theory whose axioms are ℬ_1*, ..., ℬ_n*.
Hence ⊢_{K°*} 𝒟. Assume that M is a denumerable model of K°. We may
assume that the domain of M is the set P of positive integers (see
Exercise 2.88). Let ℬ be any axiom of K°. For example, suppose that ℬ
has the form (∃y_1)(∀y_2)(∀y_3)(∃y_4)𝒞(y_1, y_2, y_3, y_4), where 𝒞 is
quantifier-free. ℬ* has the form 𝒞(b, y_2, y_3, g(y_2, y_3)). Extend the model M
step by step in the following way (noting that the domain always remains P):
since ℬ is true for M, (∃y_1)(∀y_2)(∀y_3)(∃y_4)𝒞(y_1, y_2, y_3, y_4) is true
for M. Let the interpretation b* of b be the least positive integer y_1 such
that (∀y_2)(∀y_3)(∃y_4)𝒞(y_1, y_2, y_3, y_4) is true for M. Hence,
(∃y_4)𝒞(b, y_2, y_3, y_4) is true in this extended model. For any positive
integers y_2 and y_3, let the interpretation of g(y_2, y_3) be the least
positive integer y_4 such that 𝒞(b, y_2, y_3, y_4) is true in the extended
model. Hence, 𝒞(b, y_2, y_3, g(y_2, y_3)) is true in the extended model. If we
do this for all the axioms ℬ of K°, we obtain a model M* of K°*. Since
⊢_{K°*} 𝒟, 𝒟 is true for M*. Since M* differs from M only in having
interpretations of the new individual constants and function letters, and
since 𝒟 does not contain any of those symbols, 𝒟 is true for M. Thus, 𝒟 is
true in every denumerable model of K°. Hence, ⊢_{K°} 𝒟, by Corollary 2.20(a).
Since the axioms of K° are axioms of K, we have ⊢_K 𝒟. (For a constructive
proof of an equivalent result, see Hilbert and Bernays (1939).)

b. Clearly, K* is an extension of K, since ℬ* ⊢ ℬ. Hence, if K* is
consistent, so is K. Conversely, assume K is consistent. Let 𝒟 be any wf of
K. If K* is inconsistent, ⊢_{K*} 𝒟 ∧ ¬𝒟. By (a), ⊢_K 𝒟 ∧ ¬𝒟, contradicting the
consistency of K.


Let us use the term generalized completeness theorem for the proposition that 
every consistent generalized theory has a model. If we assume that every set 
can be well-ordered (or, equivalently, the axiom of choice), then the general- 
ized completeness theorem is a consequence of Proposition 2.33. 

By the maximal ideal theorem (MI) we mean the proposition that every proper
ideal of a Boolean algebra can be extended to a maximal ideal.* This is
equivalent to the Boolean representation theorem, which states that every Boolean
algebra is isomorphic to a Boolean algebra of sets (compare Stone, 1936). For
the theory of Boolean algebras, see Sikorski (1960) or Mendelson (1970). The 
usual proofs of the MI theorem use the axiom of choice, but it is a remarkable 
fact that the MI theorem is equivalent to the generalized completeness theo- 
rem, and this equivalence can be proved without using the axiom of choice. 


Proposition 2.36 


(Łoś, 1954a; Rasiowa and Sikorski, 1951, 1952) The generalized completeness
theorem is equivalent to the maximal ideal theorem.


Proof 


a. Assume the generalized completeness theorem. Let B be a Boolean
algebra. Construct a generalized theory with equality K having the
binary function letters ∪ and ∩, the singulary function letter f_1^1 [we
denote f_1^1(t) by t̄], predicate letters = and A_1^1, and, for each element b
in B, an individual constant a_b. By the complete description of B, we
mean the following sentences: (i) a_b ≠ a_c if b and c are distinct
elements of B; (ii) a_b ∪ a_c = a_d if b, c, d are elements of B such that
b ∪ c = d in B; (iii) a_b ∩ a_c = a_e if b, c, e are elements of B such that
b ∩ c = e in B; and (iv) ā_b = a_c if b and c are elements of B such that
b̄ = c in B, where b̄ denotes the complement of b. As axioms of K we take a
set of axioms for a Boolean algebra, axioms (A6) and (A7) for equality, the
complete description of B, and axioms asserting that A_1^1 determines a
maximal ideal (i.e., A_1^1(x ∩ x̄), A_1^1(x) ∧ A_1^1(y) ⇒ A_1^1(x ∪ y),
A_1^1(x) ⇒ A_1^1(x ∩ y), A_1^1(x) ∨ A_1^1(x̄), and ¬A_1^1(x ∪ x̄)). Now K is
consistent, for, if there were a proof in K of a contradiction, this proof
would contain only a finite number of the symbols a_b, a_c, ..., say,
a_{b_1}, ..., a_{b_n}. The elements b_1, ..., b_n generate a finite subalgebra
B′ of B. Every finite Boolean algebra clearly has a maximal ideal. Hence, B′
is a model for the wfs that occur in the proof of the contradiction, and
therefore the contradiction is true in B′, which is impossible. Thus, K is
consistent and, by the generalized completeness theorem, K has a model. That
model can be contracted to a normal model of K, which is a Boolean algebra A
with a maximal ideal I. Since the complete description of B is included in
the axioms of K, B is a subalgebra of A, and then I ∩ B is a maximal ideal
in B.


* Since {0} is a proper ideal of a Boolean algebra, this implies (and is implied by) the proposition 
that every Boolean algebra has a maximal ideal. 

b. Assume the maximal ideal theorem. Let K be a consistent generalized
theory. For each axiom ℬ of K, form the wf ℬ* obtained by
constructing a prenex normal form for ℬ and then eliminating the
quantifiers through the addition of new individual constants and
function letters (see the example preceding the proof of Proposition
2.35). Let K# be a new theory having the wfs ℬ*, plus all instances of
tautologies, as its axioms, such that its wfs contain no quantifiers and
its rules of inference are modus ponens and a rule of substitution
for variables (namely, substitution of terms for variables). Now, K# is
consistent, since the theorems of K# are also theorems of the consistent
K* of Proposition 2.35. Let B be the Lindenbaum algebra determined by K#
(i.e., for any wfs 𝒞 and 𝒟, let 𝒞 Eq 𝒟 mean that ⊢_{K#} 𝒞 ⇔ 𝒟;
Eq is an equivalence relation; let [𝒞] be the equivalence class of 𝒞;
define [𝒞] ∪ [𝒟] = [𝒞 ∨ 𝒟], [𝒞] ∩ [𝒟] = [𝒞 ∧ 𝒟], and the complement of
[𝒞] to be [¬𝒞]; under these operations, the set of equivalence classes is
a Boolean algebra, called the Lindenbaum algebra of K#). By the maximal
ideal theorem, let I be a maximal ideal in B. Define a model M of K# having
the set of terms of K# as its domain; the individual constants and function
letters are their own interpretations, and, for any predicate letter A_j^n, we
say that A_j^n(t_1, ..., t_n) is true in M if and only if
[A_j^n(t_1, ..., t_n)] is not in I. One can show easily that a wf 𝒞 of K# is
true in M if and only if [𝒞] is not in I. But, for any theorem 𝒟 of K#,
[𝒟] = 1, which is not in I. Hence, M is a model for K#. For any axiom ℬ of K,
every substitution instance of ℬ*(y_1, ..., y_n) is a theorem in K#;
therefore, ℬ*(y_1, ..., y_n) is true for all y_1, ..., y_n in the model. It
follows easily, by reversing the process through which ℬ* arose from ℬ, that
ℬ is true in the model. Hence, M is a model for K.


The maximal ideal theorem (and, therefore, also the generalized complete- 
ness theorem) turns out to be strictly weaker than the axiom of choice (see 
Halpern, 1964). 


Exercise 


2.101 Show that the generalized completeness theorem implies that every 
set can be totally ordered (and, therefore, that the axiom of choice 
holds for any set of nonempty disjoint finite sets). 


The natural algebraic structures corresponding to the propositional calcu- 
lus are Boolean algebras (see Exercise 1.60, and Rosenbloom, 1950, Chapters 
1 and 2). For first-order theories, the presence of quantifiers introduces more 
algebraic structure. For example, if K is a first-order theory, then, in the
corresponding Lindenbaum algebra B, [(∃x)ℬ(x)] = Σ_t [ℬ(t)], where Σ_t indicates
the least upper bound in B, and t ranges over all terms of K that are free
for x in ℬ(x). Two types of algebraic structure have been proposed to serve
as algebraic counterparts of quantification theory. The first, cylindrical
algebras, have been studied extensively by Tarski, Thompson, Henkin, Monk,
and others (see Henkin et al., 1971). The other approach is the theory of
polyadic algebras, invented and developed by Halmos (1962).


2.13 Elementary Equivalence: Elementary Extensions 


Two interpretations M_1 and M_2 of a generalized first-order language ℒ are said
to be elementarily equivalent (written M_1 ≡ M_2) if the sentences of ℒ true for
M_1 are the same as the sentences true for M_2. Intuitively, M_1 ≡ M_2 if and
only if M_1 and M_2 cannot be distinguished by means of the language ℒ. Of
course, since ℒ is a generalized first-order language, ℒ may have
nondenumerably many symbols.

Clearly, (1) M ≡ M; (2) if M_1 ≡ M_2, then M_2 ≡ M_1; (3) if M_1 ≡ M_2 and
M_2 ≡ M_3, then M_1 ≡ M_3.

Two models of a complete theory K must be elementarily equivalent, since 
the sentences true in these models are precisely the sentences provable in K. 
This applies, for example, to any two densely ordered sets without first or 
last elements (see page 115). 

We already know, by Proposition 2.32(b), that isomorphic models are
elementarily equivalent. The converse, however, is not true. Consider, for
example, any complete theory K that has an infinite normal model. By Corollary
2.34(b), K has normal models of any infinite cardinality ℵ_α. If we take two
normal models of K of different cardinality, they are elementarily equivalent
but not isomorphic. A concrete example is the complete theory K_2 of densely
ordered sets that have neither first nor last element. The rational numbers
and the real numbers, under their natural orderings, are elementarily
equivalent nonisomorphic models of K_2.


Exercises 


2.102 Let K_∞, the theory of infinite sets, consist of the pure theory K_1 of
equality plus the axioms ℬ_n, where ℬ_n asserts that there are at least n
elements. Show that any two models of K_∞ are elementarily equivalent
(see Exercises 2.66 and 2.96(a)).


2.103^D If M_1 and M_2 are elementarily equivalent normal models and M_1 is
finite, prove that M_1 and M_2 are isomorphic.


2.104 Let K be a theory with equality having ℵ_α symbols.


a. Prove that there are at most 2^{ℵ_α} models of K, no two of which are
elementarily equivalent.


b. Prove that there are at most 2^{ℵ_γ} mutually nonisomorphic models
of K of cardinality ℵ_β, where γ is the maximum of α and β.


2.105 Let M be any infinite normal model of a theory with equality K
having ℵ_α symbols. Prove that, for any cardinal ℵ_γ ≥ ℵ_α, there is a normal
model M* of K of cardinality ℵ_γ such that M ≡ M*.


A model M_2 of a language ℒ is said to be an extension of a model M_1 of ℒ
(written M_1 ⊆ M_2)* if the following conditions hold:


1. The domain D_1 of M_1 is a subset of the domain D_2 of M_2.

2. For any individual constant c of ℒ, c^{M_2} = c^{M_1}, where c^{M_2} and
c^{M_1} are the interpretations of c in M_2 and M_1.

3. For any function letter f_j^n of ℒ and any b_1, ..., b_n in D_1,
(f_j^n)^{M_2}(b_1, ..., b_n) = (f_j^n)^{M_1}(b_1, ..., b_n).

4. For any predicate letter A_j^n of ℒ and any b_1, ..., b_n in D_1,
⊨_{M_1} A_j^n[b_1, ..., b_n] if and only if ⊨_{M_2} A_j^n[b_1, ..., b_n].


When M_1 ⊆ M_2, one also says that M_1 is a substructure (or submodel) of M_2.


Examples 


1. If ℒ contains only the predicate letters = and <, then the set of
rational numbers under its natural ordering is an extension of the set of
integers under its natural ordering.


2. If ℒ is the language of field theory (with the predicate letter =,
function letters + and ×, and individual constants 0 and 1), then the field
of real numbers is an extension of the field of rational numbers, the
field of rational numbers is an extension of the ring of integers, and
the ring of integers is an extension of the “semiring” of nonnegative
integers. For any fields F_1 and F_2, F_1 ⊆ F_2 if and only if F_1 is a subfield
of F_2 in the usual algebraic sense.


Exercises 


2.106 Prove:
a. M ⊆ M;
b. if M_1 ⊆ M_2 and M_2 ⊆ M_3, then M_1 ⊆ M_3;
c. if M_1 ⊆ M_2 and M_2 ⊆ M_1, then M_1 = M_2.
2.107 Assume M_1 ⊆ M_2.


a. Let ℬ(x_1, ..., x_n) be a wf of the form
(∀y_1) ... (∀y_m)𝒞(x_1, ..., x_n, y_1, ..., y_m),
where 𝒞 is quantifier-free. Show that, for any b_1, ..., b_n in the domain
of M_1, if ⊨_{M_2} ℬ[b_1, ..., b_n], then ⊨_{M_1} ℬ[b_1, ..., b_n]. In
particular, any sentence (∀y_1) ... (∀y_m)𝒞(y_1, ..., y_m), where 𝒞 is
quantifier-free, is true in M_1 if it is true in M_2.


* The reader will have no occasion to confuse this use of ⊆ with that for the inclusion relation.

b. Let ℬ(x_1, ..., x_n) be a wf of the form
(∃y_1) ... (∃y_m)𝒞(x_1, ..., x_n, y_1, ..., y_m),
where 𝒞 is quantifier-free. Show that, for any b_1, ..., b_n in the domain
of M_1, if ⊨_{M_1} ℬ[b_1, ..., b_n], then ⊨_{M_2} ℬ[b_1, ..., b_n]. In
particular, any sentence (∃y_1) ... (∃y_m)𝒞(y_1, ..., y_m), where 𝒞 is
quantifier-free, is true in M_2 if it is true in M_1.


2.108 a. Let K be the predicate calculus of the language of field theory. Find 
a model M of K and a nonempty subset X of the domain D of M 
such that there is no substructure of M having domain X. 


b. If K is a predicate calculus with no individual constants or func- 
tion letters, show that, if M is a model of K and X is a subset of the 
domain D of M, then there is one and only one substructure of M 
having domain X. 


c. Let K be any predicate calculus. Let M be any model of K and let
X be any subset of the domain D of M. Let Y be the intersection of
the domains of all submodels M* of M such that X is a subset of the
domain D_{M*} of M*. Show that there is one and only one submodel of
M having domain Y. (This submodel is called the submodel generated
by X.)


A somewhat stronger relation between interpretations than “extension” is
useful in model theory. Let M_1 and M_2 be models of some language ℒ. We say
that M_2 is an elementary extension of M_1 (written M_1 ≤_e M_2) if
(1) M_1 ⊆ M_2 and (2) for any wf ℬ(y_1, ..., y_n) of ℒ and for any
b_1, ..., b_n in the domain D_1 of M_1, ⊨_{M_1} ℬ[b_1, ..., b_n] if and only if
⊨_{M_2} ℬ[b_1, ..., b_n]. (In particular, for any sentence ℬ of ℒ, ℬ is true
for M_1 if and only if ℬ is true for M_2.) When M_1 ≤_e M_2, we shall
also say that M_1 is an elementary substructure (or elementary submodel) of M_2.

It is obvious that, if M_1 ≤_e M_2, then M_1 ⊆ M_2 and M_1 ≡ M_2. The converse
is not true, as the following example shows. Let G be the elementary theory of
groups (see page 96). G has the predicate letter =, function letter +, and
individual constant 0. Let I be the group of integers and E the group of even
integers. Then E ⊆ I and I ≅ E. (The function g such that g(x) = 2x for all x
in I is an isomorphism of I with E.) Consider the wf ℬ(y): (∃x)(x + x = y).
Then ⊨_I ℬ[2], but not-⊨_E ℬ[2]. Thus, I is not an elementary extension of E.
(This example shows the stronger result that even assuming M_1 ⊆ M_2 and
M_1 ≅ M_2 does not imply M_1 ≤_e M_2.)

The following theorem provides an easy method for showing that M_1 ≤_e M_2.


Proposition 2.37 (Tarski and Vaught, 1957) 


Let M_1 ⊆ M_2. Assume the following condition:

($) For every wf ℬ(x_1, ..., x_k) of the form (∃y)𝒞(x_1, ..., x_k, y) and for
all b_1, ..., b_k in the domain D_1 of M_1, if ⊨_{M_2} ℬ[b_1, ..., b_k], then
there is some d in D_1 such that ⊨_{M_2} 𝒞[b_1, ..., b_k, d].

Then M_1 ≤_e M_2.

Proof


Let us prove:
(*) ⊨_{M_1} 𝒟[b_1, ..., b_k] if and only if ⊨_{M_2} 𝒟[b_1, ..., b_k] for any
wf 𝒟(x_1, ..., x_k) and any b_1, ..., b_k in D_1.

The proof is by induction on the number m of connectives and quantifiers in 𝒟.
If m = 0, then (*) follows from clause 4 of the definition of M_1 ⊆ M_2. Now assume
that (*) holds true for all wfs having fewer than m connectives and quantifiers.


Case 1. 𝒟 is ¬ℰ. By inductive hypothesis, ⊨_{M_1} ℰ[b_1, ..., b_k] if and only
if ⊨_{M_2} ℰ[b_1, ..., b_k]. Using the fact that not-⊨_{M_1} ℰ[b_1, ..., b_k]
if and only if ⊨_{M_1} ¬ℰ[b_1, ..., b_k], and similarly for M_2, we obtain (*).


Case 2. 𝒟 is ℰ ⇒ ℱ. By inductive hypothesis, ⊨_{M_1} ℰ[b_1, ..., b_k] if and
only if ⊨_{M_2} ℰ[b_1, ..., b_k], and similarly for ℱ. (*) then follows easily.


Case 3. 𝒟 is (∃y)ℰ(x_1, ..., x_k, y). By inductive hypothesis,
(**) ⊨_{M_1} ℰ[b_1, ..., b_k, d] if and only if ⊨_{M_2} ℰ[b_1, ..., b_k, d],
for any b_1, ..., b_k, d in D_1.


Case 3a. Assume ⊨_{M_1} (∃y)ℰ(x_1, ..., x_k, y)[b_1, ..., b_k] for some
b_1, ..., b_k in D_1. Then ⊨_{M_1} ℰ[b_1, ..., b_k, d] for some d in D_1. So,
by (**), ⊨_{M_2} ℰ[b_1, ..., b_k, d]. Hence,
⊨_{M_2} (∃y)ℰ(x_1, ..., x_k, y)[b_1, ..., b_k].

Case 3b. Assume ⊨_{M_2} (∃y)ℰ(x_1, ..., x_k, y)[b_1, ..., b_k] for some
b_1, ..., b_k in D_1. By assumption ($), there exists d in D_1 such that
⊨_{M_2} ℰ[b_1, ..., b_k, d]. Hence, by (**), ⊨_{M_1} ℰ[b_1, ..., b_k, d] and
therefore ⊨_{M_1} (∃y)ℰ(x_1, ..., x_k, y)[b_1, ..., b_k].


This completes the induction proof, since any wf is logically equivalent to 
a wf that can be built up from atomic wfs by forming negations, conditionals 
and existential quantifications. 


Exercises 


2.109 Prove:
a. M ≤_e M;
b. if M_1 ≤_e M_2 and M_2 ≤_e M_3, then M_1 ≤_e M_3;
c. if M_1 ≤_e M and M_2 ≤_e M and M_1 ⊆ M_2, then M_1 ≤_e M_2.


2.110 Let K be the theory of totally ordered sets with equality (axioms (a)-
(c) and (e)-(g) of Exercise 2.67). Let M_1 and M_2 be the models for K
with domains the set of positive integers and the set of nonnegative
integers, respectively (under their natural orderings in both cases).
Prove that M_1 ⊆ M_2 and M_1 ≅ M_2, but M_1 ≰_e M_2.


Let M be an interpretation of a language ℒ. Extend ℒ to a language ℒ* by
adding a new individual constant a_d for every member d of the domain of M.
We can extend M to an interpretation of ℒ* by taking d as the interpretation
of a_d. By the diagram of M we mean the set of all true sentences of M of the
forms A_j^n(a_{d_1}, ..., a_{d_n}), ¬A_j^n(a_{d_1}, ..., a_{d_n}), and
f_j^n(a_{d_1}, ..., a_{d_n}) = a_{d_m}. In particular, a_{d_1} ≠ a_{d_2}
belongs to the diagram if d_1 ≠ d_2. By the complete diagram of M we
mean the set of all sentences of ℒ* that are true for M.

Clearly, any model M* of the complete diagram of M determines an ele- 
mentary extension M* of M,* and vice versa. 


Exercise 


2.111 a. Let M_1 be a denumerable normal model of an ordinary theory K
with equality such that every element of the domain of M_1 is the
interpretation of some closed term of K.


i. Show that, if M_1 ⊆ M_2 and M_1 ≡ M_2, then M_1 ≤_e M_2.


ii. Prove that there is a denumerable normal elementary extension
M_3 of M_1 such that M_1 and M_3 are not isomorphic.


b. Let K be a predicate calculus with equality having two function 
letters + and x and two individual constants 0 and 1. Let M be the 
standard model of arithmetic with domain the set of natural num- 
bers, and +, x, 0 and 1 having their ordinary meaning. Prove that 
M has a denumerable normal elementary extension that is not iso- 
morphic to M, that is, there is a denumerable nonstandard model 
of arithmetic. 


Proposition 2.38 (Upward Skolem-Löwenheim-Tarski Theorem)


Let K be a theory with equality having ℵ_α symbols, and let M be a normal
model of K with domain of cardinality ℵ_β. Let γ be the maximum of α and β.
Then, for any δ ≥ γ, there is a model M* of cardinality ℵ_δ such that M ≠ M*
and M ≤_e M*.


Proof 


Add to the complete diagram of M a set of cardinality ℵ_δ of new individual
constants b_τ, together with axioms b_τ ≠ b_ρ for distinct τ and ρ and axioms
b_τ ≠ a_d for all individual constants a_d corresponding to members d of the
domain of M. This new theory K* is consistent, since M can be used as a
model for any finite number of axioms of K*. (If b_{τ_1}, ..., b_{τ_k},
a_{d_1}, ..., a_{d_m} are the new individual constants in these axioms,
interpret b_{τ_1}, ..., b_{τ_k} as distinct elements of the domain of M
different from d_1, ..., d_m.) Hence, by Corollary 2.34(a), K* has a normal
model M* of cardinality ℵ_δ such that M ⊆ M*, M ≠ M*, and M ≤_e M*.


* The elementary extension M* of M is obtained from M* by forgetting about the
interpretations of the constants a_d.

Proposition 2.39 (Downward Skolem-Löwenheim-Tarski Theorem)


Let K be a theory having ℵ_α symbols, and let M be a model of K with domain
of cardinality ℵ_γ ≥ ℵ_α. Assume A is a subset of the domain D of M having
cardinality 𝔫, and assume ℵ_β is such that ℵ_γ ≥ ℵ_β ≥ max(ℵ_α, 𝔫). Then there
is an elementary submodel M* of M of cardinality ℵ_β and with domain D*
including A.


Proof 


Since n ≤ ℵ_β ≤ ℵ_γ, we can add ℵ_β elements of D to A to obtain a larger set
B of cardinality ℵ_β. Consider any subset C of D having cardinality ℵ_β. For
every wf ℬ(y_1, ..., y_n, z) of K, and any c_1, ..., c_n in C such that
⊨_M (∃z)ℬ(y_1, ..., y_n, z)[c_1, ..., c_n], add to C the first element d of D
(with respect to some fixed well-ordering of D) such that
⊨_M ℬ[c_1, ..., c_n, d]. Denote the so-enlarged set by C#. Since K has ℵ_α
symbols, there are ℵ_α wfs. Since ℵ_α ≤ ℵ_β, there are at most ℵ_β new
elements in C# and, therefore, the cardinality of C# is ℵ_β. Form by induction
a sequence of sets C_0, C_1, ... by setting C_0 = B and C_{n+1} = (C_n)#. Let
D* = ∪_{n∈ω} C_n. Then the cardinality of D* is ℵ_β. In addition, D* is closed
under all the functions (f_j^n)^M. (Assume d_1, ..., d_n in D*. We may
assume d_1, ..., d_n in C_k for some k. Now
⊨_M (∃z)(f_j^n(x_1, ..., x_n) = z)[d_1, ..., d_n]. Hence, (f_j^n)^M(d_1, ..., d_n),
being the first and only member d of D such that
⊨_M (f_j^n(x_1, ..., x_n) = z)[d_1, ..., d_n, d], must belong to
(C_k)# = C_{k+1} ⊆ D*.) Similarly, all interpretations (a_j)^M of individual
constants are in D*. Hence, D* determines a substructure M* of M. To show that
M* ≤_e M, consider any wf ℬ(y_1, ..., y_n, z) and any d_1, ..., d_n in D* such
that ⊨_M (∃z)ℬ(y_1, ..., y_n, z)[d_1, ..., d_n]. There exists C_k such that
d_1, ..., d_n are in C_k. Let d be the first element of D such that
⊨_M ℬ[d_1, ..., d_n, d]. Then d ∈ (C_k)# = C_{k+1} ⊆ D*. So, by the Tarski-Vaught
theorem (Proposition 2.37), M* ≤_e M.


2.14 Ultrapowers: Nonstandard Analysis 
By a filter* on a nonempty set A we mean a set ℱ of subsets of A such that:

1. A ∈ ℱ
2. B ∈ ℱ ∧ C ∈ ℱ ⇒ B ∩ C ∈ ℱ
3. B ∈ ℱ ∧ B ⊆ C ∧ C ⊆ A ⇒ C ∈ ℱ
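For a finite set A the three conditions can be checked mechanically. The following is a minimal sketch, not part of the original text; the function names `powerset`, `is_filter`, and `principal_filter` are hypothetical, and sets are represented as Python frozensets.

```python
from itertools import combinations


def powerset(A):
    """Return all subsets of the finite set A as frozensets."""
    items = list(A)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


def is_filter(family, A):
    """Check conditions 1-3 for a family of subsets of a finite set A."""
    A = frozenset(A)
    family = {frozenset(B) for B in family}
    if A not in family:                      # condition 1: A belongs to the family
        return False
    for B in family:
        for C in family:
            if B & C not in family:          # condition 2: closed under intersection
                return False
    for B in family:
        for C in powerset(A):
            if B <= C and C not in family:   # condition 3: closed under supersets in A
                return False
    return True


def principal_filter(B, A):
    """All subsets C of A that include B."""
    B = frozenset(B)
    return {C for C in powerset(A) if B <= C}
```

For instance, with A = {1, 2, 3}, the family of all subsets containing 1 passes the check, while the family {{1}, A} fails condition 3 because {1, 2} is missing.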


* The notion of a filter is related to that of an ideal. A subset ℱ of 𝒫(A) is a filter on A if and only
if the set 𝒢 = {A − B | B ∈ ℱ} of complements of sets in ℱ is an ideal in the Boolean algebra 𝒫(A).
Remember that 𝒫(A) denotes the set of all subsets of A.

Examples

Let B ⊆ A. The set ℱ_B = {C | B ⊆ C ⊆ A} is a filter on A. ℱ_B consists of all subsets
of A that include B. Any filter of the form ℱ_B is called a principal filter. In
particular, ℱ_A = {A} and ℱ_∅ = 𝒫(A) are principal filters. The filter 𝒫(A) is said to be
improper and every other filter is said to be proper.


Exercises 


2.112 Show that a filter ℱ on A is proper if and only if ∅ ∉ ℱ.


2.113 Show that a filter ℱ on A is a principal filter if and only if the
intersection of all sets in ℱ is a member of ℱ.

2.114 Prove that every finite filter is a principal filter. In particular, any filter
on a finite set A is a principal filter.

2.115 Let A be infinite and let ℱ be the set of all subsets of A that are
complements of finite sets: ℱ = {C | (∃W)(C = A − W ∧ Fin(W))}, where Fin(W)
means that W is finite. Show that ℱ is a nonprincipal filter on A.

2.116 Assume A has cardinality ℵ_β. Let ℵ_α ≤ ℵ_β. Let ℱ be the set of all
subsets of A whose complements have cardinality < ℵ_α. Show that ℱ is a
nonprincipal filter on A.

2.117 A collection 𝒢 of sets is said to have the finite intersection property if
B_1 ∩ B_2 ∩ ... ∩ B_k ≠ ∅ for any sets B_1, B_2, ..., B_k in 𝒢. If 𝒢 is a collection
of subsets of A having the finite intersection property and ℋ is the set of all
finite intersections B_1 ∩ B_2 ∩ ... ∩ B_k of sets in 𝒢, show that
ℱ = {D | (∃C)(C ∈ ℋ ∧ C ⊆ D ⊆ A)} is a proper filter on A.


Definition 


A filter ℱ on a set A is called an ultrafilter on A if ℱ is a maximal proper filter
on A, that is, ℱ is a proper filter on A and there is no proper filter 𝒢 on A such
that ℱ ⊂ 𝒢.


Example 


Let d ∈ A. The principal filter ℱ_d = {B | d ∈ B ∧ B ⊆ A} is an ultrafilter on A.
Assume that 𝒢 is a filter on A such that ℱ_d ⊂ 𝒢. Let C ∈ 𝒢 − ℱ_d. Then C ⊆ A and
d ∉ C. Hence, d ∈ A − C. Thus, A − C ∈ ℱ_d ⊂ 𝒢. Since 𝒢 is a filter and C and
A − C are both in 𝒢, then ∅ = C ∩ (A − C) ∈ 𝒢. Hence, 𝒢 is not a proper filter.
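On a small finite set the maximality in the definition can be verified by brute force. The sketch below is illustrative only and rests on assumptions: the helper names are hypothetical, and the enumeration of all families of subsets is feasible only for very small A (here three elements, giving 256 candidate families).

```python
from itertools import combinations


def powerset(S):
    """Return all subsets of the finite collection S as frozensets."""
    items = list(S)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


def is_filter(family, A):
    """Check the three filter conditions on the finite set A."""
    A = frozenset(A)
    if A not in family:
        return False
    if any(B & C not in family for B in family for C in family):
        return False
    return all(C in family for B in family for C in powerset(A) if B <= C)


def proper_filters(A):
    """Enumerate every proper filter on A (a filter is proper when it omits the empty set)."""
    return [fam for fam in powerset(powerset(A))
            if is_filter(fam, A) and frozenset() not in fam]


def is_ultrafilter(family, A):
    """A proper filter is an ultrafilter when no proper filter strictly includes it."""
    family = frozenset(frozenset(B) for B in family)
    if not is_filter(family, A) or frozenset() in family:
        return False
    return not any(family < G for G in proper_filters(A))
```

With A = {1, 2, 3}, the principal filter of all subsets containing 1 is reported as an ultrafilter, while the filter {A} is not, since it is strictly included in that principal filter.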


Exercises 
2.118 Let ℱ be a proper filter on A and assume that B ⊆ A and A − B ∉ ℱ.
Prove that there is a proper filter ℱ′ ⊇ ℱ such that B ∈ ℱ′.


2.119 Let ℱ be a proper filter on A. Prove that ℱ is an ultrafilter on A if and
only if, for every B ⊆ A, either B ∈ ℱ or A − B ∈ ℱ.

2.120 Let ℱ be a proper filter on A. Show that ℱ is an ultrafilter on A if and
only if, for all B and C in 𝒫(A), if B ∉ ℱ and C ∉ ℱ, then B ∪ C ∉ ℱ.


2.121 a. Show that every principal ultrafilter on A is of the form
ℱ_d = {B | d ∈ B ∧ B ⊆ A} for some d in A.


b. Show that a nonprincipal ultrafilter on A contains no finite sets.


2.122 Let ℱ be a filter on A and let ℐ be the corresponding ideal: B ∈ ℐ if and
only if A − B ∈ ℱ. Prove that ℱ is an ultrafilter on A if and only if ℐ is
a maximal ideal.


2.123 Let X be a chain of proper filters on A, that is, for any B and C in X,
either B ⊆ C or C ⊆ B. Prove that the union ∪X = {a | (∃B)(B ∈ X ∧ a ∈ B)}
is a proper filter on A, and B ⊆ ∪X for all B in X.


Proposition 2.40 (Ultrafilter Theorem) 


Every proper filter on a set A can be extended to an ultrafilter on A.*


Proof 


Let ℱ be a proper filter on A. Let ℐ be the corresponding proper ideal: B ∈ ℐ
if and only if A − B ∈ ℱ. By Proposition 2.36, every ideal can be extended to
a maximal ideal. In particular, ℐ can be extended to a maximal ideal ℋ. If
we let 𝒰 = {B | A − B ∈ ℋ}, then 𝒰 is easily seen to be an ultrafilter and ℱ ⊆ 𝒰.

Alternatively, the existence of an ultrafilter including ℱ can be proved easily
on the basis of Zorn's lemma. (In fact, consider the set X of all proper filters ℱ′
such that ℱ ⊆ ℱ′. X is partially ordered by ⊂, and any ⊂-chain in X has an upper
bound in X, namely, by Exercise 2.123, the union of all filters in the chain. Hence,
by Zorn's lemma, there is a maximal element ℱ* in X, which is the required
ultrafilter.) However, Zorn's lemma is equivalent to the axiom of choice, which is
a stronger assumption than the generalized completeness theorem.


Corollary 2.41 


If A is an infinite set, there exists a nonprincipal ultrafilter on A. 


Proof 


Let ℱ be the filter on A consisting of all complements A − B of finite subsets
B of A (see Exercise 2.115). By Proposition 2.40, there is an ultrafilter 𝒰 ⊇ ℱ.
Assume 𝒰 is a principal ultrafilter. By Exercise 2.121(a), 𝒰 = ℱ_d for some d ∈ A.
Then A − {d} ∈ ℱ ⊆ 𝒰. Also, {d} ∈ 𝒰. Hence, ∅ = {d} ∩ (A − {d}) ∈ 𝒰, contradicting
the fact that an ultrafilter is proper.


* We assume the generalized completeness theorem. 
2.14.1 Reduced Direct Products


We shall now study an important way of constructing models. Let K be any
predicate calculus with equality. Let J be a nonempty set and, for each j in
J, let M_j be some normal model of K. In other words, consider a function F
assigning to each j in J some normal model. We denote F(j) by M_j.

Let ℱ be a filter on J. For each j in J, let D_j denote the domain of the model
M_j. By the Cartesian product Π_{j∈J} D_j we mean the set of all functions f with
domain J such that f(j) ∈ D_j for all j in J. If f ∈ Π_{j∈J} D_j, we shall refer to
f(j) as the jth component of f. Let us define a binary relation =_ℱ in
Π_{j∈J} D_j as follows:


f =_ℱ g if and only if {j | f(j) = g(j)} ∈ ℱ


If we think of the sets in ℱ as being "large" sets, then, borrowing a phrase
from measure theory, we read f =_ℱ g as "f(j) = g(j) almost everywhere."

It is easy to see that =_ℱ is an equivalence relation: (1) f =_ℱ f; (2) if
f =_ℱ g, then g =_ℱ f; (3) if f =_ℱ g and g =_ℱ h, then f =_ℱ h. For the proof
of (3), observe that {j | f(j) = g(j)} ∩ {j | g(j) = h(j)} ⊆ {j | f(j) = h(j)}.
If {j | f(j) = g(j)} and {j | g(j) = h(j)} are in ℱ, then so is their
intersection and, therefore, also {j | f(j) = h(j)}.

On the basis of the equivalence relation =_ℱ, we can divide Π_{j∈J} D_j into
equivalence classes: for any f in Π_{j∈J} D_j, we define its equivalence class
f_ℱ as {g | f =_ℱ g}. Clearly, (1) f ∈ f_ℱ; (2) f_ℱ = h_ℱ if and only if
f =_ℱ h; and (3) if f_ℱ ≠ h_ℱ, then f_ℱ ∩ h_ℱ = ∅. We denote the set of
equivalence classes f_ℱ by Π_{j∈J} D_j/ℱ. Intuitively, Π_{j∈J} D_j/ℱ is obtained
from Π_{j∈J} D_j by identifying (or merging) elements of Π_{j∈J} D_j that are
equal almost everywhere.

Now we shall define a model M of K with domain Π_{j∈J} D_j/ℱ.


1. Let c be any individual constant of K and let c_j be the interpretation
of c in M_j. Then the interpretation of c in M will be f_ℱ, where f is the
function such that f(j) = c_j for all j in J. We denote f by {c_j}_{j∈J}.


2. Let f_k^n be any function letter of K and let A_k^n be any predicate letter of
K. Their interpretations (f_k^n)^M and (A_k^n)^M are defined in the following
manner. Let (g_1)_ℱ, ..., (g_n)_ℱ be any members of Π_{j∈J} D_j/ℱ.


a. (f_k^n)^M((g_1)_ℱ, ..., (g_n)_ℱ) = h_ℱ, where
h(j) = (f_k^n)^{M_j}(g_1(j), ..., g_n(j)) for all j in J.


b. (A_k^n)^M((g_1)_ℱ, ..., (g_n)_ℱ) holds if and only if
{j | ⊨_{M_j} A_k^n[g_1(j), ..., g_n(j)]} ∈ ℱ.


Intuitively, (f_k^n)^M is calculated componentwise, and (A_k^n)^M holds if and
only if A_k^n holds in almost all components. Definitions (a) and (b) have
to be shown to be independent of the choice of the representatives
g_1, ..., g_n in the equivalence classes (g_1)_ℱ, ..., (g_n)_ℱ: if
g_1 =_ℱ g_1*, ..., g_n =_ℱ g_n*, and h*(j) = (f_k^n)^{M_j}(g_1*(j), ..., g_n*(j)),
then (i) h =_ℱ h* and (ii) {j | ⊨_{M_j} A_k^n[g_1(j), ..., g_n(j)]} ∈ ℱ if and
only if {j | ⊨_{M_j} A_k^n[g_1*(j), ..., g_n*(j)]} ∈ ℱ.
Part (i) follows from the inclusion:

{j | g_1(j) = g_1*(j)} ∩ ... ∩ {j | g_n(j) = g_n*(j)} ⊆
{j | (f_k^n)^{M_j}(g_1(j), ..., g_n(j)) = (f_k^n)^{M_j}(g_1*(j), ..., g_n*(j))}


Part (ii) follows from the inclusions:


{j | g_1(j) = g_1*(j)} ∩ ... ∩ {j | g_n(j) = g_n*(j)} ⊆
{j | ⊨_{M_j} A_k^n[g_1(j), ..., g_n(j)] if and only if ⊨_{M_j} A_k^n[g_1*(j), ..., g_n*(j)]}


and


{j | ⊨_{M_j} A_k^n[g_1(j), ..., g_n(j)]} ∩
{j | ⊨_{M_j} A_k^n[g_1(j), ..., g_n(j)] if and only if ⊨_{M_j} A_k^n[g_1*(j), ..., g_n*(j)]} ⊆
{j | ⊨_{M_j} A_k^n[g_1*(j), ..., g_n*(j)]}


In the case of the equality relation =, which is an abbreviation for A_1^2,


(A_1^2)^M(g_ℱ, h_ℱ) if and only if {j | ⊨_{M_j} A_1^2[g(j), h(j)]} ∈ ℱ
if and only if {j | g(j) = h(j)} ∈ ℱ
if and only if g =_ℱ h


that is, if and only if g_ℱ = h_ℱ. Hence, the interpretation (A_1^2)^M is the
identity relation and the model M is normal.

The model M just defined will be denoted Π_{j∈J} M_j/ℱ and will be called a
reduced direct product. When ℱ is an ultrafilter, Π_{j∈J} M_j/ℱ is called an
ultraproduct. When ℱ is an ultrafilter and all the M_j's are the same model N,
then Π_{j∈J} M_j/ℱ is denoted N^J/ℱ and is called an ultrapower.


Examples 


1. Choose a fixed element r of the index set J, and let ℱ be the principal
ultrafilter ℱ_r = {B | r ∈ B ∧ B ⊆ J}. Then, for any f, g in Π_{j∈J} D_j,
f =_ℱ g if and only if {j | f(j) = g(j)} ∈ ℱ, that is, if and only if
f(r) = g(r). Hence, a member of Π_{j∈J} D_j/ℱ consists of all f in
Π_{j∈J} D_j that have the same rth component. For any predicate letter
A_k^n of K and any g_1, ..., g_n in Π_{j∈J} D_j,
⊨_M A_k^n[(g_1)_ℱ, ..., (g_n)_ℱ] if and only if
{j | ⊨_{M_j} A_k^n[g_1(j), ..., g_n(j)]} ∈ ℱ, that is, if and only if
⊨_{M_r} A_k^n[g_1(r), ..., g_n(r)]. Hence, it is easy to verify that the
function φ: Π_{j∈J} D_j/ℱ → D_r, defined by φ(g_ℱ) = g(r), is an isomorphism
of Π_{j∈J} M_j/ℱ with M_r. Thus, when ℱ is a principal ultrafilter, the
ultraproduct Π_{j∈J} M_j/ℱ is essentially the same as one of its components
and yields nothing new.

2. Let ℱ be the filter {J}. Then, for any f, g in Π_{j∈J} D_j, f =_ℱ g if and
only if {j | f(j) = g(j)} ∈ ℱ, that is, if and only if f(j) = g(j) for all
j in J, or if and only if f = g. Thus, every member of Π_{j∈J} D_j/ℱ is a
singleton {g} for some g in Π_{j∈J} D_j. Moreover,
(f_k^n)^M((g_1)_ℱ, ..., (g_n)_ℱ) = {g}, where g is such that
g(j) = (f_k^n)^{M_j}(g_1(j), ..., g_n(j)) for all j in J. Also,
⊨_M A_k^n[(g_1)_ℱ, ..., (g_n)_ℱ] if and only if
⊨_{M_j} A_k^n[g_1(j), ..., g_n(j)] for all j in J. Hence, Π_{j∈J} M_j/ℱ is, in
this case, essentially the same as the ordinary "direct product"
Π_{j∈J} M_j, in which the operations and relations are defined componentwise.


3. Let ℱ be the improper filter 𝒫(J). Then, for any f, g in Π_{j∈J} D_j,
f =_ℱ g if and only if {j | f(j) = g(j)} ∈ ℱ, that is, if and only if
{j | f(j) = g(j)} ∈ 𝒫(J). Thus, f =_ℱ g for all f and g, and Π_{j∈J} D_j/ℱ
consists of only one element. For any predicate letter A_k^n,
⊨_M A_k^n[f_ℱ, ..., f_ℱ] if and only if
{j | ⊨_{M_j} A_k^n[f(j), ..., f(j)]} ∈ 𝒫(J); that is, every atomic wf is true.
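The three examples can be checked concretely on a small finite index set. The sketch below is only an illustration under stated assumptions: the index set J = {0, 1, 2}, the two-element domains D_j, and the names `equal_mod` and `classes` are all hypothetical choices made for this example.

```python
from itertools import combinations, product

J = (0, 1, 2)                      # hypothetical finite index set
D = {j: (0, 1) for j in J}         # hypothetical domains D_j = {0, 1}


def powerset(S):
    """Return all subsets of S as a set of frozensets."""
    items = list(S)
    return {frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)}


def equal_mod(f, g, F):
    """f =_F g holds when the agreement set {j | f(j) = g(j)} belongs to F."""
    return frozenset(j for j in J if f[j] == g[j]) in F


def classes(F):
    """Partition the Cartesian product of the D_j into =_F equivalence classes."""
    result = []
    for vals in product(*(D[j] for j in J)):
        f = dict(zip(J, vals))
        for cls in result:
            if equal_mod(f, cls[0], F):   # comparing with one representative suffices
                cls.append(f)
                break
        else:
            result.append([f])
    return result


principal = {B for B in powerset(J) if 1 in B}   # Example 1 with r = 1
trivial = {frozenset(J)}                          # Example 2: the filter {J}
improper = powerset(J)                            # Example 3: the improper filter
```

Here `classes(principal)` has as many classes as D_1 has elements, `classes(trivial)` has one class per element of the Cartesian product, and `classes(improper)` collapses everything into a single class, matching Examples 1 to 3.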


The basic theorem on ultraproducts is due to Łoś (1955b).


Proposition 2.42 (Łoś's Theorem)


Let ℱ be an ultrafilter on a set J and let M = Π_{j∈J} M_j/ℱ be an ultraproduct.


a. Let s = ((g_1)_ℱ, (g_2)_ℱ, ...) be a denumerable sequence of elements of
Π_{j∈J} D_j/ℱ. For each j in J, let s_j be the denumerable sequence
(g_1(j), g_2(j), ...) in D_j. Then, for any wf ℬ of K, s satisfies ℬ in M if
and only if {j | s_j satisfies ℬ in M_j} ∈ ℱ.


b. For any sentence ℬ of K, ℬ is true in Π_{j∈J} M_j/ℱ if and only if
{j | ⊨_{M_j} ℬ} ∈ ℱ. (Thus, (b) asserts that a sentence ℬ is true in an
ultraproduct if and only if it is true in almost all components.)


Proof 


a. We shall use induction on the number m of connectives and quantifiers
in ℬ. We can reduce the case m = 0 to the following subcases*:
(i) A_k^n(x_{i_1}, ..., x_{i_n}); (ii) x_ℓ = f_k^n(x_{i_1}, ..., x_{i_n}); and
(iii) x_ℓ = a_k. For subcase (i), s satisfies A_k^n(x_{i_1}, ..., x_{i_n}) if and
only if ⊨_M A_k^n[(g_{i_1})_ℱ, ..., (g_{i_n})_ℱ], which is equivalent to
{j | ⊨_{M_j} A_k^n[g_{i_1}(j), ..., g_{i_n}(j)]} ∈ ℱ; that is,
{j | s_j satisfies A_k^n(x_{i_1}, ..., x_{i_n}) in M_j} ∈ ℱ. Subcases (ii) and
(iii) are handled in similar fashion.


* A wf A_k^n(t_1, ..., t_n) can be replaced by
(∀u_1)...(∀u_n)(u_1 = t_1 ∧ ... ∧ u_n = t_n ⇒ A_k^n(u_1, ..., u_n)), and a
wf x = f_k^n(t_1, ..., t_n) can be replaced by
(∀z_1)...(∀z_n)(z_1 = t_1 ∧ ... ∧ z_n = t_n ⇒ x = f_k^n(z_1, ..., z_n)). In
this way, every wf is equivalent to a wf built up from wfs of the forms (i)-(iii) by applying
connectives and quantifiers.
Now, let us assume the result holds for all wfs that have fewer than
m connectives and quantifiers.


Case 1. ℬ is ¬𝒞. By inductive hypothesis, s satisfies 𝒞 in M if and only
if {j | s_j satisfies 𝒞 in M_j} ∈ ℱ. s satisfies ¬𝒞 in M if and only if
{j | s_j satisfies 𝒞 in M_j} ∉ ℱ. But, since ℱ is an ultrafilter, the last
condition is equivalent, by Exercise 2.119, to
{j | s_j satisfies ¬𝒞 in M_j} ∈ ℱ.


Case 2. ℬ is 𝒞 ∧ 𝒟. By inductive hypothesis, s satisfies 𝒞 in M if and
only if {j | s_j satisfies 𝒞 in M_j} ∈ ℱ, and s satisfies 𝒟 in M if and only if
{j | s_j satisfies 𝒟 in M_j} ∈ ℱ. Therefore, s satisfies 𝒞 ∧ 𝒟 if and only
if both of the indicated sets belong to ℱ. But, this is equivalent to their
intersection belonging to ℱ, which, in turn, is equivalent to
{j | s_j satisfies 𝒞 ∧ 𝒟 in M_j} ∈ ℱ.


Case 3. ℬ is (∃x_i)𝒞. Assume s satisfies (∃x_i)𝒞. Then there exists h in
Π_{j∈J} D_j such that s′ satisfies 𝒞 in M, where s′ is the same as s except that
h_ℱ is the ith component of s′. By inductive hypothesis, s′ satisfies 𝒞 in
M if and only if {j | s_j′ satisfies 𝒞 in M_j} ∈ ℱ. Hence,
{j | s_j satisfies (∃x_i)𝒞 in M_j} ∈ ℱ, since, if s_j′ satisfies 𝒞 in M_j, then
s_j satisfies (∃x_i)𝒞 in M_j. Conversely, assume
W = {j | s_j satisfies (∃x_i)𝒞 in M_j} ∈ ℱ. For each j in W, choose some s_j′
such that s_j′ is the same as s_j except in at most the ith component and s_j′
satisfies 𝒞. Now define h in Π_{j∈J} D_j as follows: for j in W, let h(j) be the
ith component of s_j′, and, for j ∉ W, choose h(j) to be an arbitrary element
of D_j. Let s″ be the same as s except that its ith component is h_ℱ. Then
W ⊆ {j | s_j″ satisfies 𝒞 in M_j} ∈ ℱ. Hence, by the inductive hypothesis, s″
satisfies 𝒞 in M. Therefore, s satisfies (∃x_i)𝒞 in M.

b. This follows from part (a) by noting that a sentence ℬ is true in a
model if and only if some sequence satisfies ℬ.


Corollary 2.43 


If M is a model and ℱ is an ultrafilter on J, and if M* is the ultrapower M^J/ℱ,
then M* ≡ M.


Proof 


Let ℬ be any sentence. Then, by Proposition 2.42(b), ℬ is true in M* if and
only if {j | ℬ is true in M_j} ∈ ℱ, where every M_j is M. If ℬ is true in M,
{j | ℬ is true in M_j} = J ∈ ℱ. If ℬ is false in M,
{j | ℬ is true in M_j} = ∅ ∉ ℱ.

Corollary 2.43 can be strengthened considerably. For each c in the domain
D of M, let c# stand for the constant function such that c#(j) = c for all j in J.
Define the function ψ such that, for each c in D, ψ(c) = (c#)_ℱ ∈ D^J/ℱ, and denote
the range of ψ by M#. M# obviously contains the interpretations in M* of the
individual constants. Moreover, M# is closed under the operations (f_k^n)^{M*}: for
(f_k^n)^{M*}((c_1#)_ℱ, ..., (c_n#)_ℱ) is h_ℱ, where h(j) = (f_k^n)^M(c_1, ..., c_n)
for all j in J, and (f_k^n)^M(c_1, ..., c_n) is a fixed element b of D. So,
h_ℱ = (b#)_ℱ ∈ M#. Thus, M# is a substructure of M*.


Corollary 2.44 


ψ is an isomorphism of M with M#, and M# ≤_e M*.


Proof 


a. By definition of M#, the range of ψ is M#.

b. ψ is one-one. (For any c, d in D, (c#)_ℱ = (d#)_ℱ if and only if c# =_ℱ d#,
which is equivalent to {j | c#(j) = d#(j)} ∈ ℱ; that is, {j | c = d} ∈ ℱ. If
c ≠ d, {j | c = d} = ∅ ∉ ℱ, and, therefore, ψ(c) ≠ ψ(d).)

c. For any c_1, ..., c_n in D,
(f_k^n)^{M*}(ψ(c_1), ..., ψ(c_n)) = (f_k^n)^{M*}((c_1#)_ℱ, ..., (c_n#)_ℱ) = h_ℱ,
where h(j) = (f_k^n)^M(c_1#(j), ..., c_n#(j)) = (f_k^n)^M(c_1, ..., c_n).
Thus, h_ℱ = (((f_k^n)^M(c_1, ..., c_n))#)_ℱ = ψ((f_k^n)^M(c_1, ..., c_n)).

d. ⊨_{M*} A_k^n[ψ(c_1), ..., ψ(c_n)] if and only if
{j | ⊨_M A_k^n[c_1#(j), ..., c_n#(j)]} ∈ ℱ, which is equivalent to
{j | ⊨_M A_k^n[c_1, ..., c_n]} ∈ ℱ, that is, ⊨_M A_k^n[c_1, ..., c_n].
Thus, ψ is an isomorphism of M with M#.


To see that M# ≤_e M*, let ℬ be any wf and (c_1#)_ℱ, ..., (c_n#)_ℱ ∈ M#. Then, by
Proposition 2.42(a), ⊨_{M*} ℬ[(c_1#)_ℱ, ..., (c_n#)_ℱ] if and only if
{j | ⊨_M ℬ[c_1#(j), ..., c_n#(j)]} ∈ ℱ, which is equivalent to
{j | ⊨_M ℬ[c_1, ..., c_n]} ∈ ℱ, which, in turn, is equivalent to
⊨_M ℬ[c_1, ..., c_n], that is, to ⊨_{M#} ℬ[(c_1#)_ℱ, ..., (c_n#)_ℱ], since ψ is an
isomorphism of M with M#.


Exercises 


2.124 (The compactness theorem again; see Exercise 2.54) If all finite subsets
of a set of sentences Γ have a model, then Γ has a model.


2.125 a. A class 𝒲 of interpretations of a language ℒ is called elementary if
there is a set Γ of sentences of ℒ such that 𝒲 is the class of all models
of Γ. Prove that 𝒲 is elementary if and only if 𝒲 is closed under
elementary equivalence and the formation of ultraproducts.


b. A class 𝒲 of interpretations of a language ℒ will be called sentential
if there is a sentence ℬ of ℒ such that 𝒲 is the class of all models of
ℬ. Prove that a class 𝒲 is sentential if and only if both 𝒲 and its
complement (all interpretations of ℒ not in 𝒲) are closed with
respect to elementary equivalence and ultraproducts.


c. Prove that the theory K of fields of characteristic 0 (see page 116) is
axiomatizable but not finitely axiomatizable.
2.14.2 Nonstandard Analysis


From the invention of the calculus until relatively recent times the idea of 
infinitesimals has been an intuitively meaningful tool for finding new results 
in analysis. The fact that there was no rigorous foundation for infinitesimals 
was a source of embarrassment and led mathematicians to discard them in 
favor of the rigorous limit ideas of Cauchy and Weierstrass. However, almost 
fifty years ago, Abraham Robinson discovered that it was possible to res- 
urrect infinitesimals in an entirely legitimate and precise way. This can be 
done by constructing models that are elementarily equivalent to, but not iso- 
morphic to, the ordered field of real numbers. Such models can be produced 
either by using Proposition 2.33 or as ultrapowers. We shall sketch here the 
method based on ultrapowers. 

Let R be the set of real numbers. Let K be a generalized predicate calculus 
with equality having the following symbols: 


1. For each real number r, there is an individual constant a_r.
2. For every n-ary operation φ on R, there is a function letter f_φ.
3. For every n-ary relation Φ on R, there is a predicate letter A_Φ.


We can think of R as forming the domain of a model ℛ for K; we simply let
(a_r)^ℛ = r, (f_φ)^ℛ = φ, and (A_Φ)^ℛ = Φ.

Let ℱ be a nonprincipal ultrafilter on the set ω of natural numbers. We can
then form the ultrapower ℛ* = ℛ^ω/ℱ. We denote the domain R^ω/ℱ of ℛ* by R*.
By Corollary 2.43, ℛ* ≡ ℛ and, therefore, ℛ* has all the properties
formalizable in K that ℛ possesses. Moreover, by Corollary 2.44, ℛ* has an
elementary submodel ℛ# that is an isomorphic image of ℛ. The domain R# of ℛ#
consists of all elements (c#)_ℱ corresponding to the constant functions
c#(i) = c for all i in ω. We shall sometimes refer to the members of R# also as
real numbers; the elements of R* − R# will be called nonstandard reals.

That there exist nonstandard reals can be shown by explicitly exhibiting
one. Let ι(j) = j for all j in ω. Then ι_ℱ ∈ R*. However, (c#)_ℱ < ι_ℱ for all c
in R, by virtue of Łoś's theorem and the fact that {j | c#(j) < ι(j)} = {j | c < j},
being the set of all natural numbers greater than a fixed real number, is the
complement of a finite set and is, therefore, in the nonprincipal ultrafilter ℱ.
ι_ℱ is an "infinitely large" nonstandard real. (The relation < used in the
assertion (c#)_ℱ < ι_ℱ is the relation on the ultrapower ℛ* corresponding to the
predicate letter < of K. We use the symbol < instead of (<)^{ℛ*} in order to
avoid excessive notation, and we shall often do the same with other relations
and functions, such as u + v, u × v, and |u|.)

Since ℛ* possesses all the properties of ℛ formalizable in K, ℛ* is an
ordered field having the real number field ℛ# as a proper subfield. (ℛ* is
non-Archimedean: the element ι_ℱ defined above is greater than all the natural
numbers (n#)_ℱ of ℛ#.) Let R_1, the set of "finite" elements of R*, contain those
elements z such that |z| < u for some real number u in R#. (R_1 is easily seen
to form a subring of R*.) Let R_0 consist of 0 and the "infinitesimals" of R*, that
is, those elements z ≠ 0 such that |z| < u for all positive real numbers u in R#.
(The reciprocal 1/ι_ℱ is an infinitesimal.) It is not difficult to verify that R_0
is an ideal in the ring R_1. In fact, since x ∈ R_1 − R_0 implies that
1/x ∈ R_1 − R_0, it can be easily proved that R_0 is a maximal ideal in R_1.


Exercises 


2.126 Prove that the cardinality of R* is 2^{ℵ_0}.

2.127 Prove that the set R_1 is closed under the operations of +, −, and ×.

2.128 Prove that, if x ∈ R_1 and y ∈ R_0, then xy ∈ R_0.

2.129 Prove that, if x ∈ R_1 − R_0, then 1/x ∈ R_1 − R_0.


Let x ∈ R_1. Let A = {u | u ∈ R# ∧ u < x} and B = {u | u ∈ R# ∧ u > x}. Then (A, B)
is a "cut" and, therefore, determines a unique real number r such that
(1) (∀u)(u ∈ A ⇒ u ≤ r) and (2) (∀u)(u ∈ B ⇒ u ≥ r).* The difference x − r is 0 or an
infinitesimal. (Proof: Assume x − r is not 0 or an infinitesimal. Then
|x − r| > r_1 for some positive real number r_1. If x > r, then x − r > r_1. So
x > r + r_1 > r. But then r + r_1 ∈ A, contradicting condition (1). If x < r, then
r − x > r_1, and so r > r − r_1 > x. Thus, r − r_1 ∈ B, contradicting condition (2).)
The real number r such that x − r is 0 or an infinitesimal is called the standard
part of x and is denoted st(x). Note that, if x is itself a real number, then
st(x) = x. We shall use the notation x ≈ y to mean st(x) = st(y). Clearly, x ≈ y if
and only if x − y is 0 or an infinitesimal. If x ≈ y, we say that x and y are
infinitely close.


Exercises 


2.130 If x ∈ R_1, show that there is a unique real number r such that x − r is 0
or an infinitesimal. (It is necessary to check this to ensure that st(x) is
well-defined.)


2.131 If x and y are in R_1, prove the following.
a. st(x + y) = st(x) + st(y)
b. st(xy) = st(x)st(y)
c. st(−x) = −st(x) ∧ st(y − x) = st(y) − st(x)
d. x ≥ 0 ⇒ st(x) ≥ 0
e. x ≤ y ⇒ st(x) ≤ st(y)

The set of natural numbers is a subset of the real numbers. Therefore, in the
theory K there is a predicate letter N corresponding to the property x ∈ ω.
Hence, in R*, there is a set ω* of elements satisfying the wf N(x). An element
f_ℱ of R* satisfies N(x) if and only if {j | f(j) ∈ ω} ∈ ℱ. In particular, the
elements (n#)_ℱ, for n ∈ ω, are the "standard" members of ω*, whereas ι_ℱ, for
example, is a "nonstandard" natural number in R*.


* See Mendelson (1973, Chapter 5).

Many of the properties of the real number system can be studied from the 
viewpoint of nonstandard analysis. For example, if s is an ordinary denu- 
merable sequence of real numbers and c is a real number, one ordinarily says 
that lim s_n = c if


(&) (∀ε)(ε > 0 ⇒ (∃n)(n ∈ ω ∧ (∀k)(k ∈ ω ∧ k ≥ n ⇒ |s_k − c| < ε)))


Since s ∈ R^ω, s is a relation and, therefore, the theory K contains a predicate
letter S(n, x) corresponding to the relation s_n = x. Hence, R* will have a
relation of all pairs (n, x) satisfying S(n, x). Since ℛ* ≡ ℛ, this relation will
be a function that is an extension of the given sequence to the larger domain ω*.
Then we have the following result.


Proposition 2.45 


Let s be a denumerable sequence of real numbers and c a real number. Let s*
denote the function from ω* into R* corresponding to s in ℛ*. Then lim s_n = c
if and only if s*(n) ≈ c for all n in ω* − ω. (The latter condition can be
paraphrased by saying that s*(n) is infinitely close to c when n is infinitely large.)


Proof 


Assume lim s_n = c. Consider any positive real ε. By (&), there is a natural
number n_0 such that (∀k)(k ∈ ω ∧ k ≥ n_0 ⇒ |s_k − c| < ε) holds in ℛ. Hence, the
corresponding sentence (∀k)(k ∈ ω* ∧ k ≥ n_0 ⇒ |s*(k) − c| < ε) holds in ℛ*. For
any n in ω* − ω, n > n_0 and, therefore, |s*(n) − c| < ε. Since this holds for all
positive real ε, s*(n) − c is 0 or an infinitesimal.

Conversely, assume s*(n) ≈ c for all n ∈ ω* − ω. Take any positive real ε. Fix
some n_1 in ω* − ω. Then (∀k)(k ≥ n_1 ⇒ |s*(k) − c| < ε). So the sentence
(∃n)(n ∈ ω ∧ (∀k)(k ∈ ω ∧ k ≥ n ⇒ |s_k − c| < ε)) is true for ℛ* and, therefore,
also for ℛ. So there must be a natural number n_0 such that
(∀k)(k ∈ ω ∧ k ≥ n_0 ⇒ |s_k − c| < ε). Since ε was an arbitrary positive real
number, we have proved lim s_n = c.
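As an illustration of how the criterion is used (a worked sketch added here, not part of the original text), take the sequence s_n = 1/(n + 1) with c = 0. The representative h and the bound m below are names introduced only for this example; h is assumed to be a function from ω to ω representing an element n = h_ℱ of ω* − ω.

```latex
% Worked sketch: s_n = 1/(n+1) has limit 0, checked with the nonstandard criterion.
% Assumption: n = h_F, where h : \omega \to \omega represents an element of \omega^* - \omega.
\begin{align*}
  s^*(n) &= \bigl(j \mapsto s_{h(j)}\bigr)_{\mathcal F}
          = \Bigl(j \mapsto \tfrac{1}{h(j)+1}\Bigr)_{\mathcal F}
          && \text{(\L o\'s's theorem applied to } S(n,x)\text{)} \\
  \{\, j : h(j) > m \,\} &\in \mathcal F \quad \text{for every } m \in \omega
          && \text{(} n \text{ exceeds every standard natural number)} \\
  \bigl\{\, j : \tfrac{1}{h(j)+1} < \varepsilon \,\bigr\}
          &\supseteq \{\, j : h(j) > m \,\} \quad \text{whenever } m \ge 1/\varepsilon
          && \text{(so the left-hand set is in } \mathcal F\text{)} \\
  |s^*(n) - 0| &< \varepsilon \quad \text{for every positive real } \varepsilon,
          \quad \text{hence } s^*(n) \approx 0 .
\end{align*}
```

Since this holds for every n in ω* − ω, the proposition gives lim s_n = 0 without any explicit choice of n_0 for each ε.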


Exercise 


2.132 Using Proposition 2.45, prove the following limit theorems for the real
number system. If s and u are denumerable sequences of real numbers
and c_1 and c_2 are real numbers such that lim s_n = c_1 and lim u_n = c_2, then:


a. lim (s_n + u_n) = c_1 + c_2;
b. lim (s_n u_n) = c_1 c_2;
c. If c_2 ≠ 0 and all u_n ≠ 0, lim (s_n/u_n) = c_1/c_2.
Let us now consider another important notion of analysis, continuity. Let B
be a set of real numbers, let c ∈ B, and let f be a function defined on B and
taking real values. One says that f is continuous at c if


(∀ε)(ε > 0 ⇒ (∃δ)(δ > 0 ∧ (∀x)(x ∈ B ∧ |x − c| < δ ⇒ |f(x) − f(c)| < ε)))


Proposition 2.46


Let f be a real-valued function on a set B of real numbers. Let c ∈ B. Let B* be
the subset of R* corresponding to B, and let f* be the function corresponding
to f.* Then f is continuous at c if and only if (∀x)(x ∈ B* ∧ x ≈ c ⇒ f*(x) ≈ f(c)).


Exercises 


2.133 Prove Proposition 2.46. 


2.134 Assume f and g are real-valued functions defined on a set B of real 
numbers and assume that f and g are continuous at a point c in B. 
Using Proposition 2.46, prove the following. 


a. f + g is continuous at c. 
b. f · g is continuous at c.


2.135 Let f be a real-valued function defined on a set B of real numbers and
continuous at a point c in B, and let g be a real-valued function defined
on a set A of real numbers containing the image of B under f. Assume
that g is continuous at the point f(c). Prove, by Proposition 2.46, that
the composition g ∘ f is continuous at c.


2.136 Let C ⊆ R.


a. C is said to be closed if (∀x)((∀ε)[ε > 0 ⇒ (∃y)(y ∈ C ∧ |x − y| < ε)] ⇒
x ∈ C). Show that C is closed if and only if every real number that is
infinitely close to a member of C* is in C.


b. C is said to be open if (∀x)(x ∈ C ⇒ (∃δ)(δ > 0 ∧ (∀y)(|y − x| < δ ⇒
y ∈ C))). Show that C is open if and only if every nonstandard real
number that is infinitely close to a member of C is a member of C*.


Many standard theorems of analysis turn out to have much simpler proofs
within nonstandard analysis. Even stronger results can be obtained by starting
with a theory K that has symbols, not only for the elements, operations and
relations on R, but also for sets of subsets of R, sets of sets of subsets of R, and
so on. In this way, the methods of nonstandard analysis can be applied to all
areas of modern analysis, sometimes with original and striking results. For
further development and applications, see A. Robinson (1966), Luxemburg (1969),
Bernstein (1973), Stroyan and Luxemburg (1976), and Davis (1977a). A calculus
textbook based on nonstandard analysis has been written by Keisler (1976) and
has been used in some experimental undergraduate courses.


* To be more precise, f is represented in the theory K by a predicate letter A_f, where A_f(x, y)
corresponds to the relation f(x) = y. Then the corresponding relation (A_f)* in R* determines a
function f* with domain B*.


Exercises 


2.137 A real-valued function f defined on a closed interval
[a, b] = {x | a ≤ x ≤ b} is said to be uniformly continuous if


(∀ε)(ε > 0 ⇒ (∃δ)(δ > 0 ∧ (∀x)(∀y)(a ≤ x ≤ b ∧ a ≤ y ≤ b ∧ |x − y| < δ
⇒ |f(x) − f(y)| < ε)))


Prove that f is uniformly continuous if and only if, for all x and y in
[a, b]*, x ≈ y ⇒ f*(x) ≈ f*(y).

2.138 Prove by nonstandard methods that any function continuous on [a, b] 
is uniformly continuous on [a, b]. 


2.15 Semantic Trees 


Remember that a wf is logically valid if and only if it is true for all
interpretations. Since there are uncountably many interpretations, there is no
simple direct way to determine logical validity. Gödel's completeness theorem
(Corollary 2.19) showed that logical validity is equivalent to derivability in
a predicate calculus. But, to find out whether a wf is provable in a predicate
calculus, we have only a very clumsy method that is not always applicable:
start generating the theorems and watch to see whether the given wf ever
appears. Our aim here is to outline a more intuitive and usable approach in
the case of wfs without function letters. Throughout this section, we assume
that no function letters occur in our wfs.

A wf is logically valid if and only if its negation is not satisfiable. We shall
now explain a simple procedure for trying to determine satisfiability of a
closed wf ℬ.* Our purpose is either to show that ℬ is not satisfiable or to find
a model for ℬ.

We shall construct a figure in the shape of an inverted tree. Start with the
wf ℬ at the top (the "root" of the tree). We apply certain rules for writing
wfs below those already obtained. These rules replace complicated wfs by
simpler ones in a way that corresponds to the meaning of the connectives
and quantifiers.


* Remember that a wf is logically valid if and only if its closure is logically valid. So it suffices
to consider only closed wfs.


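[Figure: the rules for extending a semantic tree. In each rule, the wfs after "write" are placed below the given wf; a vertical bar | separates the two sides of a fork.]

Negation:
    ¬¬𝒞              write 𝒞
    ¬(𝒞 ∨ 𝒟)         write ¬𝒞 and ¬𝒟
    ¬(𝒞 ⇒ 𝒟)         write 𝒞 and ¬𝒟
    ¬(∀x)𝒞           write (∃x)¬𝒞
    ¬(∃x)𝒞           write (∀x)¬𝒞
    ¬(𝒞 ∧ 𝒟)         fork: ¬𝒞 | ¬𝒟
    ¬(𝒞 ⇔ 𝒟)         fork: 𝒞 and ¬𝒟 | ¬𝒞 and 𝒟

Conjunction:
    𝒞 ∧ 𝒟            write 𝒞 and 𝒟

Disjunction:
    𝒞 ∨ 𝒟            fork: 𝒞 | 𝒟

Conditional:
    𝒞 ⇒ 𝒟            fork: ¬𝒞 | 𝒟

Biconditional:
    𝒞 ⇔ 𝒟            fork: 𝒞 and 𝒟 | ¬𝒞 and ¬𝒟

Universal quantifier (Rule U):
    (∀x)𝒞(x)         write 𝒞(b)   [Here, b is any individual constant already present.]

Existential quantifier:
    (∃x)𝒞(x)         write 𝒞(c)   [c is a new individual constant not already in the figure.]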
Note that some of the rules require a fork or branching. This occurs when the 
given wf implies that one of two possible situations holds. 

A branch is a sequence of wfs starting at the top and proceeding down the figure
by applications of the rules. When a wf and its negation appear in a branch,
that branch becomes closed and no further rules need be applied to the wf at the
end of the branch. Closure of a branch will be indicated by a large cross ×.

Inspection of the rules shows that, when a rule is applied to a wf, the usefulness
of that wf has been exhausted (the formula will be said to be discharged)
and that formula need never be subject to a rule again, except in the case of a
universally quantified wf. In the latter case, whenever a new individual constant
appears in a branch below the wf, rule U can be applied with that new constant.
In addition, if no further rule applications are possible along a branch and no individual
constant occurs in that branch, then we must introduce a new individual constant for
use in possible applications of rule U along that branch. (The idea behind this
requirement is that, if we are trying to build a model, we must introduce a symbol for at
least one object that can belong to the domain of the model.)


2.15.1 Basic Principle of Semantic Trees 


If all branches become closed, the original wf is unsatisfiable. If, however, a 
branch remains unclosed, that branch can be used to construct a model in 
which the original wf is true; the domain of the model consists of the indi- 
vidual constants that appear in that branch. 
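
The propositional part of this principle can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the book's own formulation: the tuple encoding of wfs and the names `neg`, `expand`, `open_branch`, and `is_tautology` are hypothetical, the quantifier rules are omitted, and a branch is closed only when an atomic wf and its negation both occur on it (which suffices in the propositional case).

```python
# Wfs are nested tuples (an encoding assumed for this sketch):
#   ("atom", "p"), ("not", B), ("and", B, C), ("or", B, C),
#   ("imp", B, C), ("iff", B, C)

def neg(b):
    return ("not", b)

def expand(wf):
    """Return the alternative extensions produced by the rule for wf
    (one list of wfs per fork), or None if wf is a literal."""
    op = wf[0]
    if op == "atom":
        return None
    if op == "and":
        return [[wf[1], wf[2]]]
    if op == "or":
        return [[wf[1]], [wf[2]]]
    if op == "imp":
        return [[neg(wf[1])], [wf[2]]]
    if op == "iff":
        return [[wf[1], wf[2]], [neg(wf[1]), neg(wf[2])]]
    b = wf[1]                      # here op == "not"
    if b[0] == "atom":
        return None
    if b[0] == "not":
        return [[b[1]]]
    if b[0] == "or":
        return [[neg(b[1]), neg(b[2])]]
    if b[0] == "imp":
        return [[b[1], neg(b[2])]]
    if b[0] == "and":
        return [[neg(b[1])], [neg(b[2])]]
    if b[0] == "iff":
        return [[b[1], neg(b[2])], [neg(b[1]), b[2]]]
    raise ValueError(f"unknown connective: {b[0]}")

def open_branch(pending, literals=frozenset()):
    """Return the literals on some open branch, or None if every branch closes."""
    if not pending:
        return literals
    wf, rest = pending[0], pending[1:]
    alternatives = expand(wf)
    if alternatives is None:       # wf is a literal
        if neg(wf) in literals or (wf[0] == "not" and wf[1] in literals):
            return None            # a wf and its negation: the branch closes
        return open_branch(rest, literals | {wf})
    for extension in alternatives: # wf is discharged; follow each fork
        result = open_branch(extension + rest, literals)
        if result is not None:
            return result
    return None

def is_tautology(wf):
    # wf is a tautology when the tree rooted at its negation closes.
    return open_branch([neg(wf)]) is None

p, q = ("atom", "p"), ("atom", "q")
print(is_tautology(("imp", p, ("imp", q, p))))
print(is_tautology(("imp", ("or", p, q), p)))
```

When a branch stays open, the returned literals describe a truth assignment that makes the root wf true, which mirrors the way an open branch is used to build a model.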

We shall discuss the justification of this principle later on. First, we shall 
give examples of its use. 


Examples 


1. To prove that (∀x)ℬ(x) ⇒ ℬ(b) is logically valid, we build a semantic tree
starting from its negation.


   i. ¬((∀x)ℬ(x) ⇒ ℬ(b))
   ii. (∀x)ℬ(x)   (i)
   iii. ¬ℬ(b)   (i)
   iv. ℬ(b)   (ii)
         ×

The number to the right of a given wf indicates the number of the line
of the wf from which the given wf is derived. Since the only branch in
this tree is closed, ¬((∀x)ℬ(x) ⇒ ℬ(b)) is unsatisfiable and, therefore,
(∀x)ℬ(x) ⇒ ℬ(b) is logically valid.
2.
   i. ¬[(∀x)(ℬ(x) ⇒ 𝒞(x)) ⇒ ((∀x)ℬ(x) ⇒ (∀x)𝒞(x))]
   ii. (∀x)(ℬ(x) ⇒ 𝒞(x))   (i)
   iii. ¬((∀x)ℬ(x) ⇒ (∀x)𝒞(x))   (i)
   iv. (∀x)ℬ(x)   (iii)
   v. ¬(∀x)𝒞(x)   (iii)
   vi. (∃x)¬𝒞(x)   (v)
   vii. ¬𝒞(b)   (vi)
   viii. ℬ(b)   (iv)
   ix. ℬ(b) ⇒ 𝒞(b)   (ii)
   x. ¬ℬ(b)      𝒞(b)   (ix)
         ×          ×

Since both branches are closed, the original wf (i) is unsatisfiable and,
therefore, (∀x)(ℬ(x) ⇒ 𝒞(x)) ⇒ ((∀x)ℬ(x) ⇒ (∀x)𝒞(x)) is logically valid.


3.
   i. ¬[(∃x)A₁¹(x) ⇒ (∀x)A₁¹(x)]
   ii. (∃x)A₁¹(x)   (i)
   iii. ¬(∀x)A₁¹(x)   (i)
   iv. A₁¹(b)   (ii)
   v. (∃x)¬A₁¹(x)   (iii)
   vi. ¬A₁¹(c)   (v)

No further applications of rules are possible and there is still an open
branch. Define a model M with domain {b, c} such that the interpretation
of A₁¹ holds for b but not for c. Thus, (∃x)A₁¹(x) is true in M but (∀x)A₁¹(x)
is false in M. Hence, (∃x)A₁¹(x) ⇒ (∀x)A₁¹(x) is false in M and is, therefore,
not logically valid.
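
The model read off the open branch can be checked mechanically. The following is a minimal sketch, assuming a hypothetical encoding in which the domain is a Python set of constant names and the interpretation of the predicate letter is the set of constants for which it holds.

```python
# Model M from the open branch: domain {b, c}, and the predicate letter
# holds for b only (hypothetical encoding chosen for this sketch).
domain = {"b", "c"}
A = {"b"}

exists_A = any(x in A for x in domain)       # (∃x)A(x)
forall_A = all(x in A for x in domain)       # (∀x)A(x)
conditional = (not exists_A) or forall_A     # (∃x)A(x) ⇒ (∀x)A(x)

print(exists_A, forall_A, conditional)
```

The antecedent is true and the consequent false in M, so the conditional is false in M, in agreement with the argument above.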


4.
   i. ¬[(∃y)(∀x)ℬ(x, y) ⇒ (∀x)(∃y)ℬ(x, y)]
   ii. (∃y)(∀x)ℬ(x, y)   (i)
   iii. ¬(∀x)(∃y)ℬ(x, y)   (i)
   iv. (∀x)ℬ(x, b)   (ii)
   v. (∃x)¬(∃y)ℬ(x, y)   (iii)
   vi. ℬ(b, b)   (iv)
   vii. ¬(∃y)ℬ(c, y)   (v)
   viii. ℬ(c, b)   (iv)
   ix. (∀y)¬ℬ(c, y)   (vii)
   x. ¬ℬ(c, b)   (ix)
         ×

Hence, (∃y)(∀x)ℬ(x, y) ⇒ (∀x)(∃y)ℬ(x, y) is logically valid.

Notice that, in the last tree, step (vi) served no purpose but was required 
by our method of constructing trees. We should be a little more precise 
in describing that method. At each step, we apply the appropriate rule 
to each undischarged wf (except universally quantified wfs), starting 
from the top of the tree. Then, to every universally quantified wf on a 
given branch we apply rule U with every individual constant that has 
appeared on that branch since the last step. In every application of a 
rule to a given wf, we write the resulting wf(s) below the branch that 
contains that wf. 


5.
   i. ¬[(∀x)ℬ(x) ⇒ (∃x)ℬ(x)]
   ii. (∀x)ℬ(x)   (i)
   iii. ¬(∃x)ℬ(x)   (i)
   iv. (∀x)¬ℬ(x)   (iii)
   v. ℬ(b)   (ii)*
   vi. ¬ℬ(b)   (iv)
         ×

Hence, (∀x)ℬ(x) ⇒ (∃x)ℬ(x) is logically valid.
6.
   i. ¬[(∀x)¬A₁²(x, x) ⇒ (∃x)(∀y)¬A₁²(x, y)]
   ii. (∀x)¬A₁²(x, x)   (i)
   iii. ¬(∃x)(∀y)¬A₁²(x, y)   (i)
   iv. (∀x)¬(∀y)¬A₁²(x, y)   (iii)
   v. ¬A₁²(a₁, a₁)   (ii)†
   vi. ¬(∀y)¬A₁²(a₁, y)   (iv)
   vii. (∃y)¬¬A₁²(a₁, y)   (vi)
   viii. ¬¬A₁²(a₁, a₂)   (vii)
   ix. A₁²(a₁, a₂)   (viii)
   x. ¬A₁²(a₂, a₂)   (ii)
   xi. ¬(∀y)¬A₁²(a₂, y)   (iv)
   xii. (∃y)¬¬A₁²(a₂, y)   (xi)
   xiii. ¬¬A₁²(a₂, a₃)   (xii)
   xiv. A₁²(a₂, a₃)   (xiii)


We can see that the branch will never end and that we will obtain a sequence
of constants a₁, a₂, ... with wfs A₁²(aₙ, aₙ₊₁) and ¬A₁²(aₙ, aₙ). Thus, we construct
a model M with domain {a₁, a₂, ...} and we define (A₁²)ᴹ to contain only the
pairs ⟨aₙ, aₙ₊₁⟩. Then, (∀x)¬A₁²(x, x) is true in M, whereas (∃x)(∀y)¬A₁²(x, y) is
false in M. Hence, (∀x)¬A₁²(x, x) ⇒ (∃x)(∀y)¬A₁²(x, y) is not logically valid.
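
Since the domain of this model is infinite, a program cannot verify the two claims outright, but a finite window of the model gives a sanity check. The sketch below assumes a hypothetical encoding in which the constant aₙ is represented by the integer n and the relation holds of (m, n) exactly when n = m + 1; the bound `N` is arbitrary.

```python
def A(m, n):
    # Interpretation of the binary predicate letter: only the pairs (a_n, a_{n+1}).
    return n == m + 1

N = 50  # size of the finite window inspected (an arbitrary choice)

# (∀x)¬A(x, x): no constant is related to itself.
irreflexive = all(not A(n, n) for n in range(1, N + 1))

# (∃x)(∀y)¬A(x, y) fails: each a_n has the witness a_{n+1} with A(a_n, a_{n+1}).
every_x_has_witness = all(A(n, n + 1) for n in range(1, N + 1))

print(irreflexive, every_x_has_witness)
```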


Exercises 


2.139 Use semantic trees to determine whether the following wfs are logi- 
cally valid. 


a. (∀x)(A₁¹(x) ∨ A₂¹(x)) ⇒ ((∀x)A₁¹(x)) ∨ (∀x)A₂¹(x)
b. ((∀x)ℬ(x)) ∧ (∀x)𝒞(x) ⇒ (∀x)(ℬ(x) ∧ 𝒞(x))
c. (∀x)(ℬ(x) ∧ 𝒞(x)) ⇒ ((∀x)ℬ(x)) ∧ (∀x)𝒞(x)


* Here, we must introduce a new individual constant for use with rule U since, otherwise, the 
branch would end and would not contain any individual constants. 

† Here, we must introduce a new individual constant for use with rule U since, otherwise, the
branch would end and would not contain any individual constants.

d. (∃x)(A₁¹(x) ⇒ A₂¹(x)) ⇒ ((∃x)A₁¹(x) ⇒ (∃x)A₂¹(x))
e. (∃x)(∃y)A₁²(x, y) ⇒ (∃z)A₁²(z, z)
f. ((∀x)A₁¹(x)) ∨ (∀x)A₂¹(x) ⇒ (∀x)(A₁¹(x) ∨ A₂¹(x))
g. (∃x)(∃y)(A₁²(x, y) ⇒ (∀z)A₁²(z, y))
h. The wfs of Exercises 2.24, 2.31(a, e, j), 2.39, and 2.40.
i. The wfs of Exercise 2.21(a, b, g).
j. (∀x)(A₁¹(x) ⇒ A₂¹(x)) ⇒ ¬(∀x)(A₁¹(x) ⇒ ¬A₂¹(x))


Proposition 2.47 


Assume that Γ is a set of closed wfs that satisfy the following closure
conditions: (a) if ¬¬ℬ is in Γ, then ℬ is in Γ; (b) if ¬(ℬ ∨ 𝒞) is in Γ, then ¬ℬ and ¬𝒞 are
in Γ; (c) if ¬(ℬ ⇒ 𝒞) is in Γ, then ℬ and ¬𝒞 are in Γ; (d) if ¬(∀x)ℬ is in Γ, then (∃x)¬ℬ
is in Γ; (e) if ¬(∃x)ℬ is in Γ, then (∀x)¬ℬ is in Γ; (f) if ¬(ℬ ∧ 𝒞) is in Γ, then at
least one of ¬ℬ and ¬𝒞 is in Γ; (g) if ¬(ℬ ⇔ 𝒞) is in Γ, then either ℬ and ¬𝒞 are
in Γ or ¬ℬ and 𝒞 are in Γ; (h) if ℬ ∧ 𝒞 is in Γ, then so are ℬ and 𝒞; (i) if ℬ ∨ 𝒞
is in Γ, then at least one of ℬ and 𝒞 is in Γ; (j) if ℬ ⇒ 𝒞 is in Γ, then at least one
of ¬ℬ and 𝒞 is in Γ; (k) if ℬ ⇔ 𝒞 is in Γ, then either ℬ and 𝒞 are in Γ or ¬ℬ and
¬𝒞 are in Γ; (l) if (∀x)ℬ(x) is in Γ, then ℬ(b) is in Γ (where b is any individual
constant that occurs in some wf of Γ); (m) if (∃x)ℬ(x) is in Γ, then ℬ(b) is in Γ
for some individual constant b. If no wf and its negation both belong to Γ and
some wfs in Γ contain individual constants, then there is a model for Γ whose
domain is the set D of individual constants that occur in wfs of Γ.


Proof 


Define a model M with domain D by specifying that the interpretation
of any predicate letter Aₖⁿ in Γ contains an n-tuple ⟨b₁, ..., bₙ⟩ if and only if
Aₖⁿ(b₁, ..., bₙ) is in Γ. By induction on the number of connectives and
quantifiers in any closed wf ℰ, it is easy to prove: (i) if ℰ is in Γ, then ℰ is true in M;
and (ii) if ¬ℰ is in Γ, then ℰ is false in M. Hence, M is a model for Γ.

If a branch of a semantic tree remains open, the set Γ of wfs of that branch
satisfies the hypotheses of Proposition 2.47. It follows that, if a branch of a
semantic tree remains open, then the set Γ of wfs of that branch has a model
M whose domain is the set of individual constants that appear in that branch.
This yields half of the basic principle of semantic trees.


Proposition 2.48 


If all the branches of a semantic tree are closed, then the wf ℬ at the root of
the tree is unsatisfiable.


Proof


From the derivation rules it is clear that, if a sequence of wfs starts at ℬ and
continues down the tree through the applications of the rules, and if the
wfs in that sequence are simultaneously satisfiable in some model M, then
that sequence can be extended by another application of a rule so that the
added wf(s) would also be true in M. Otherwise, the sequence would form
an unclosed branch, contrary to our hypothesis. Assume now that ℬ is
satisfiable in a model M. Then, starting with ℬ, we could construct an
infinite branch in which all the wfs are true in M. (In the case of a branching
rule, if there are two ways to extend the sequence, we choose the left-hand
wf.) Therefore, the branch would not be closed, contrary to our hypothesis.
Hence, ℬ is unsatisfiable.

This completes the proof of the basic principle of semantic trees. Notice
that this principle does not yield a decision procedure for logical validity. If
a closed wf ℬ is not logically valid, the semantic tree of ¬ℬ may (and often
does) contain an infinite unclosed branch. At any stage of the construction of
this tree, we have no general procedure for deciding whether or not, at some
later stage, all branches of the tree will have become closed. Thus, we have
no general way of knowing whether ¬ℬ is unsatisfiable.

For the sake of brevity, our exposition has been loose and imprecise. 
A clear and masterful study of semantic trees and related matters can be 
found in Smullyan (1968). 


2.16 Quantification Theory Allowing Empty Domains 


Our definition in Section 2.2 of interpretations of a language assumed that 
the domain of an interpretation is nonempty. This was done for the sake of 
simplicity. If we allow the empty domain, questions arise as to the right way 
of defining the truth of a formula in such a domain.* Once that is decided, 
the corresponding class of valid formulas (that is, formulas true in all inter- 
pretations, including the one with an empty domain) becomes smaller, and 
it is difficult to find an axiom system that will have all such formulas as its 
theorems. Finally, an interpretation with an empty domain has little or no 
importance in applications of logic. 

Nevertheless, the problem of finding a suitable treatment of such a more 
inclusive logic has aroused some curiosity and we shall present one possible 
approach. In order to do so, we shall have to restrict the scope of the investi- 
gation in the following ways. 


* For example, should a formula of the form (∀x)(A₁¹(x) ∧ ¬A₁¹(x)) be considered true in the
empty domain?

First, our languages will contain no individual constants or function
letters. The reason for this restriction is that it is not clear how to interpret
individual constants or function letters when the domain of the interpretation is
empty. Moreover, in first-order theories with equality, individual constants
and function letters always can be replaced by new predicate letters, together
with suitable axioms.*

Second, we shall take every formula of the form (∀x)ℬ(x) to be true in the
empty domain. This is based on parallelism with the case of a nonempty
domain. To say that (∀x)ℬ(x) holds in a nonempty domain D amounts to
asserting


(*) for any object c, if c ∈ D, then ℬ(c)


When D is empty, "c ∈ D" is false and, therefore, "if c ∈ D, then ℬ(c)" is true.
Since this holds for arbitrary c, (*) is true in the empty domain D, that is,
(∀x)ℬ(x) is true in an empty domain. Not unexpectedly, (∃x)ℬ(x) will be false in
an empty domain, since (∃x)ℬ(x) is equivalent to ¬(∀x)¬ℬ(x).

These two conventions enable us to calculate the truth value of any closed
formula in an empty domain. Every such formula is a truth-functional
combination of formulas of the form (∀x)ℬ(x). Replace every subformula (∀x)ℬ(x)
by the truth value T and then compute the truth value of the whole formula.
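
This calculation is easy to mechanize. The sketch below is only an illustration under stated assumptions: the tuple encoding of formulas and the name `true_in_empty_domain` are hypothetical, and existential quantifiers are handled directly (as false) rather than first being rewritten as ¬(∀x)¬.

```python
# Formulas as nested tuples (an encoding assumed for this sketch):
#   ("forall", "x", B), ("exists", "x", B), ("not", B),
#   ("and", B, C), ("or", B, C), ("imp", B, C), ("iff", B, C),
#   ("atom", "A", "x")   # atomic formulas; only legal inside a quantifier's scope

def true_in_empty_domain(wf):
    """Truth value of a sentence in the interpretation with empty domain."""
    op = wf[0]
    if op == "forall":
        return True        # every (∀x)B(x) is taken to be true
    if op == "exists":
        return False       # (∃x)B(x) is ¬(∀x)¬B(x), hence false
    if op == "not":
        return not true_in_empty_domain(wf[1])
    if op == "and":
        return true_in_empty_domain(wf[1]) and true_in_empty_domain(wf[2])
    if op == "or":
        return true_in_empty_domain(wf[1]) or true_in_empty_domain(wf[2])
    if op == "imp":
        return (not true_in_empty_domain(wf[1])) or true_in_empty_domain(wf[2])
    if op == "iff":
        return true_in_empty_domain(wf[1]) == true_in_empty_domain(wf[2])
    # An atomic formula outside every quantifier has a free variable.
    raise ValueError("not a sentence of a language without individual constants")

A = ("atom", "A", "x")
print(true_in_empty_domain(("imp", ("forall", "x", A), ("exists", "x", A))))
print(true_in_empty_domain(("forall", "x", ("and", A, ("not", A)))))
```

The first sentence, (∀x)A(x) ⇒ (∃x)A(x), comes out false in the empty domain although it is logically valid, while the second, (∀x)(A(x) ∧ ¬A(x)), comes out true.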

It is not clear how we should define the truth value in the empty domain of 
a formula containing free variables. We might imitate what we do in the case 
of nonempty domains and take such a formula to have the same truth values 
as its universal closure. Since the universal closure is automatically true in the 
empty domain, this would have the uncomfortable consequence of declaring 
the formula A₁¹(x) ∧ ¬A₁¹(x) to be true in the empty domain. For this reason, we
shall confine our attention to sentences, that is, formulas without free variables. 

A sentence will be said to be inclusively valid if it is true in all interpreta- 
tions, including the interpretation with an empty domain. Every inclusively 
valid sentence is logically valid, but the converse does not hold. To see this, 
let f stand for a sentence 𝒞 ∧ ¬𝒞, where 𝒞 is some fixed sentence. Now, f is false
in the empty domain but (∀x)f is true in the empty domain (since it begins
with a universal quantifier). Thus the sentence (∀x)f ⇒ f is false in the empty
domain and, therefore, not inclusively valid. However, it is logically valid,
since every formula of the form (∀x)ℬ ⇒ ℬ is logically valid.

The problem of determining the inclusive validity of a sentence is reduc- 
ible to that of determining its logical validity, since we know how to deter- 
mine whether a sentence is true in the empty domain. Since the problem of 
determining logical validity will turn out to be unsolvable (by Proposition 
3.54), the same applies to inclusive validity. 


* For example, an individual constant b can be replaced by a new monadic predicate letter P,
together with the axiom (∃y)(∀x)(P(x) ⇔ x = y). Any axiom ℬ(b) should be replaced by
(∀x)(P(x) ⇒ ℬ(x)).

Now let us turn to the problem of finding an axiom system whose
theorems are the inclusively valid sentences. We shall adapt for this purpose an
axiom system PP* based on Exercise 2.28. As axioms we take all the
following formulas:


(A1) ℬ ⇒ (𝒞 ⇒ ℬ)

(A2) (ℬ ⇒ (𝒞 ⇒ 𝒟)) ⇒ ((ℬ ⇒ 𝒞) ⇒ (ℬ ⇒ 𝒟))

(A3) (¬𝒞 ⇒ ¬ℬ) ⇒ ((¬𝒞 ⇒ ℬ) ⇒ 𝒞)

(A4) (∀x)ℬ(x) ⇒ ℬ(y) if ℬ(x) is a wf of ℒ and y is a variable that is free for x in
ℬ(x). (Recall that, if y is x itself, then the axiom has the form (∀x)ℬ ⇒ ℬ.
In addition, x need not be free in ℬ(x).)

(A5) (∀x)(ℬ ⇒ 𝒞) ⇒ (ℬ ⇒ (∀x)𝒞) if ℬ contains no free occurrences of x.

(A6) (∀y₁) ... (∀yₙ)(ℬ ⇒ 𝒞) ⇒ [(∀y₁) ... (∀yₙ)ℬ ⇒ (∀y₁) ... (∀yₙ)𝒞]


together with all formulas obtained by prefixing any sequence of universal 
quantifiers to instances of (A1)—(A6). 

Modus ponens (MP) will be the only rule of inference. 

PP denotes the pure first-order predicate calculus, whose axioms are (A1)— 
(A5), whose rules of inference are MP and Gen, and whose language contains 
no individual constants or function letters. By Gödel's completeness theorem
(Corollary 2.19), the theorems of PP are the same as the logically valid
formulas in PP. Exercise 2.28 shows first that Gen is a derived rule of inference
of PP*, that is, if ⊢_PP* ℬ, then ⊢_PP* (∀x)ℬ, and second that PP and PP* have the
same theorems. Hence, the theorems of PP* are the logically valid formulas.

Let PPS* be the same system as PP* except that, as axioms, we take only the 
axioms of PP* that are sentences. Since MP takes sentences into sentences, 
all theorems of PPS* are sentences. Since all axioms of PPS* are axioms of 
PP*, all theorems of PPS* are logically valid sentences. Let us show that the 
converse holds. 


Proposition 2.49 


Every logically valid sentence is a theorem of PPS*. 


Proof 


Let ℬ be any logically valid sentence. We know that ℬ is a theorem of PP*.
Let us show that ℬ is a theorem of PPS*. In a proof of ℬ in PP*, let u₁, ..., uₙ be
the free variables (if any) in the proof, and prefix (∀u₁) ... (∀uₙ) to all steps of
the proof. Then each step goes into a theorem of PPS*. To see this, first note
that axioms of PP* go into axioms of PPS*. Second, assume that 𝒟 comes from
𝒞 and 𝒞 ⇒ 𝒟 by MP in the original proof and that (∀u₁) ... (∀uₙ)𝒞 and
(∀u₁) ... (∀uₙ)(𝒞 ⇒ 𝒟) are provable in PPS*. Since
(∀u₁) ... (∀uₙ)(𝒞 ⇒ 𝒟) ⇒ [(∀u₁) ... (∀uₙ)𝒞 ⇒ (∀u₁) ... (∀uₙ)𝒟]
is an instance of axiom (A6) of PPS*, it follows that (∀u₁) ... (∀uₙ)𝒟 is
provable in PPS*. Thus, (∀u₁) ... (∀uₙ)ℬ is a theorem of PPS*. Then n
applications of axiom (A4) and MP show that ℬ is a theorem of PPS*.

Not all axioms of PPS* are inclusively valid. For example, the sentence (∀x)f ⇒ f
discussed earlier is an instance of axiom (A4) that is not inclusively valid. So,
in order to find an axiom system for inclusive validity, we must modify PPS*.

If P is a sequence of variables u₁, ..., uₙ, then by ∀P we shall mean the
expression (∀u₁) ... (∀uₙ).

Let the axiom system ETH be obtained from PPS* by changing axiom
(A4) into:

(A4′) All sentences of the form ∀P[(∀x)ℬ(x) ⇒ ℬ(y)], where y is free for x in
ℬ(x) and x is free in ℬ(x), and P is a sequence of variables that includes all
variables free in ℬ (and possibly others).

MP is the only rule of inference.
It is obvious that all axioms of ETH are inclusively valid.


Lemma 2.50 


If 𝒯 is an instance of a tautology and P is a sequence of variables that
contains all free variables in 𝒯, then ⊢_ETH ∀P𝒯.


Proof 


By the completeness of axioms (A1)-(A3) for the propositional calculus, there
is a proof of 𝒯 using MP and instances of (A1)-(A3). If we prefix ∀P to all
steps of that proof, the resulting sentences are all theorems of ETH. In the
case when an original step ℬ was an instance of (A1)-(A3), ∀Pℬ is an axiom
of ETH. For steps that result from MP, we use axiom (A6).


Lemma 2.51 


If P is a sequence of variables that includes all free variables of ℬ ⇒ 𝒞, and
⊢_ETH ∀Pℬ and ⊢_ETH ∀P[ℬ ⇒ 𝒞], then ⊢_ETH ∀P𝒞.


Proof 


Use axiom (A6) and MP. 


Lemma 2.52 
If P is a sequence of variables that includes all free variables of ℬ, 𝒞, 𝒟, and
⊢_ETH ∀P[ℬ ⇒ 𝒞] and ⊢_ETH ∀P[𝒞 ⇒ 𝒟], then ⊢_ETH ∀P[ℬ ⇒ 𝒟].


Proof 


Use the tautology (ℬ ⇒ 𝒞) ⇒ ((𝒞 ⇒ 𝒟) ⇒ (ℬ ⇒ 𝒟)), Lemma 2.50, and Lemma
2.51 twice.




Lemma 2.53 


If x is not free in ℬ and P is a sequence of variables that contains all free
variables of ℬ, then ⊢_ETH ∀P[ℬ ⇒ (∀x)ℬ].


Proof 


By axiom (A5), ⊢_ETH ∀P[(∀x)(ℬ ⇒ ℬ) ⇒ (ℬ ⇒ (∀x)ℬ)]. By Lemma 2.50,
⊢_ETH ∀P[(∀x)(ℬ ⇒ ℬ)]. Now use Lemma 2.51.


Corollary 2.54 


If ℬ has no free variables, then ⊢_ETH ℬ ⇒ (∀x)ℬ.


Lemma 2.55 


If x is not free in ℬ and P is a sequence of variables that includes all variables
free in ℬ, then ⊢_ETH ∀P[¬(∀x)f ⇒ ((∀x)ℬ ⇒ ℬ)].


Proof 


⊢_ETH ∀P[¬ℬ ⇒ (ℬ ⇒ f)] by Lemma 2.50. By Lemma 2.53,
⊢_ETH ∀P[(ℬ ⇒ f) ⇒ (∀x)(ℬ ⇒ f)]. Hence, by Lemma 2.52,
⊢_ETH ∀P[¬ℬ ⇒ (∀x)(ℬ ⇒ f)]. By axiom (A6),
⊢_ETH ∀P[(∀x)(ℬ ⇒ f) ⇒ ((∀x)ℬ ⇒ (∀x)f)]. Hence, by Lemma 2.52,
⊢_ETH ∀P[¬ℬ ⇒ ((∀x)ℬ ⇒ (∀x)f)]. Since
[¬ℬ ⇒ ((∀x)ℬ ⇒ (∀x)f)] ⇒ [¬(∀x)f ⇒ ((∀x)ℬ ⇒ ℬ)] is an instance
of a tautology, Lemmas 2.50 and 2.51 yield ⊢_ETH ∀P[¬(∀x)f ⇒ ((∀x)ℬ ⇒ ℬ)].


Proposition 2.56 


ETH + {¬(∀x)f} is a complete axiom system for logical validity, that is, a
sentence is logically valid if and only if it is a theorem of the system.


Proof 


All axioms of the system are logically valid. (Note that (∀x)f is false in all
interpretations with a nonempty domain and, therefore, ¬(∀x)f is true in all
such domains.) By Proposition 2.49, all logically valid sentences are
provable in PPS*. The only axioms of PPS* missing from ETH are those of the
form ∀P[(∀x)ℬ ⇒ ℬ], where x is not free in ℬ and P is any sequence of
variables that includes all free variables of ℬ. By Lemma 2.55,
⊢_ETH ∀P[¬(∀x)f ⇒ ((∀x)ℬ ⇒ ℬ)]. By Corollary 2.54, ∀P[¬(∀x)f] will be derivable
in ETH + {¬(∀x)f}. Hence, ∀P[(∀x)ℬ ⇒ ℬ] is obtained by using axiom (A6).


Lemma 2.57


If P is a sequence of variables that includes all free variables of ℬ, then
⊢_ETH ∀P[(∀x)f ⇒ ((∀x)ℬ ⇔ t)], where t is ¬f.


Proof 


Since f ⇒ ℬ is an instance of a tautology, Lemma 2.50 yields ⊢_ETH ∀P(∀x)[f ⇒ ℬ].
By axiom (A6), ⊢_ETH ∀P[(∀x)[f ⇒ ℬ] ⇒ [(∀x)f ⇒ (∀x)ℬ]]. Hence,
⊢_ETH ∀P[(∀x)f ⇒ (∀x)ℬ] by Lemma 2.51. Since (∀x)ℬ ⇒ [(∀x)ℬ ⇔ t] is an instance
of a tautology, Lemma 2.50 yields ⊢_ETH ∀P[(∀x)ℬ ⇒ [(∀x)ℬ ⇔ t]]. Now, by
Lemma 2.52, ⊢_ETH ∀P[(∀x)f ⇒ [(∀x)ℬ ⇔ t]].

Given a formula ℬ, construct a formula ℬ* in the following way. Moving
from left to right, replace each universal quantifier and its scope by t.


Lemma 2.58 


If P is a sequence of variables that includes all free variables of ℬ, then
⊢_ETH ∀P[(∀x)f ⇒ [ℬ ⇔ ℬ*]].


Proof 


Apply Lemma 2.57 successively to the formulas obtained in the stepwise
construction of ℬ*. We leave the details to the reader.


Proposition 2.59 


ETH is a complete axiom system for inclusive validity, that is, a sentence ℬ is
inclusively valid if and only if it is a theorem of ETH.


Proof 


Assume ℬ is a sentence valid for all interpretations. We must show that
⊢_ETH ℬ. Since ℬ is valid in all nonempty domains, Proposition 2.56 implies that ℬ
is provable in ETH + {¬(∀x)f}. Hence, by the deduction theorem,


(+) ⊢_ETH ¬(∀x)f ⇒ ℬ.


Now, by Lemma 2.58,


(%) ⊢_ETH (∀x)f ⇒ [ℬ ⇔ ℬ*]

(Since ℬ has no free variables, we can take P in Lemma 2.58 to be empty.)
Hence, [(∀x)f ⇒ [ℬ ⇔ ℬ*]] is valid for all interpretations. Since (∀x)f is valid
in the empty domain and ℬ is valid for all interpretations, ℬ* is valid in the
empty domain. But ℬ* is a truth-functional combination of ts. So, ℬ* must
be truth-functionally equivalent to either t or f. Since it is valid in the empty
domain, it is truth-functionally equivalent to t. Hence, ⊢_ETH ℬ*. Therefore by
(%), ⊢_ETH (∀x)f ⇒ ℬ. This, together with (+), yields ⊢_ETH ℬ.

The ideas and methods used in this section stem largely, but not entirely, 
from a paper by Hailperin (1953).* That paper also made use of an idea in 
Mostowski (1951b), the idea that underlies the proof of Proposition 2.59. 
Mostowski’s approach to the logic of the empty domain is quite different 
from Hailperin’s and results in a substantially different axiom system for 
inclusive validity. For example, when ℬ does not contain x free, Mostowski
interprets (∀x)ℬ and (∃x)ℬ to be ℬ itself. This makes (∀x)f equivalent to f,
rather than to t, as in our development.


* The name ETH comes from "empty domain" and "Theodore Hailperin." My simplification of
Hailperin's axiom system was suggested by a similar simplification in Quine (1954).

Formal Number Theory


3.1 An Axiom System 


Together with geometry, the theory of numbers is the most immediately intui- 
tive of all branches of mathematics. It is not surprising, then, that attempts to 
formalize mathematics and to establish a rigorous foundation for mathemat- 
ics should begin with number theory. The first semiaxiomatic presentation of 
this subject was given by Dedekind in 1879 and, in a slightly modified form,
has come to be known as Peano’s postulates.* It can be formulated as follows: 


(P1) 0 is a natural number.†


(P2) If x is a natural number, there is another natural number denoted
by x′ (and called the successor of x).‡

(P3) 0 ≠ x′ for every natural number x.

(P4) If x′ = y′, then x = y.

(P5) If Q is a property that may or may not hold for any given natural 
number, and if (I) 0 has the property Q and (II) whenever a natural 
number x has the property Q, then x’ has the property Q, then 
all natural numbers have the property Q (mathematical induction 
principle). 


These axioms, together with a certain amount of set theory, can be used to 
develop not only number theory but also the theory of rational, real, and 
complex numbers (see Mendelson, 1973). However, the axioms involve cer- 
tain intuitive notions, such as “property,” that prevent this system from 
being a rigorous formalization. We therefore shall build a first-order theory 
S that is based upon Peano’s postulates and seems to be adequate for the 
proofs of all the basic results of elementary number theory. 

The language ℒ_A of our theory S will be called the language of arithmetic.
ℒ_A has a single predicate letter A₁². As usual, we shall write t = s for A₁²(t, s).
ℒ_A has one individual constant a₁. We shall use 0 as an alternative notation
for a₁. Finally, ℒ_A has three function letters, f₁¹, f₁², and f₂². We shall write (t′)
instead of f₁¹(t), (t + s) instead of f₁²(t, s), and (t · s) instead of f₂²(t, s). However,
we shall write t′, t + s, and t · s instead of (t′), (t + s), and (t · s) whenever this
will cause no confusion.


* For historical information, see Wang (1957).
† The natural numbers are supposed to be the nonnegative integers 0, 1, 2, ....
‡ The intuitive meaning of x′ is x + 1.

The proper axioms of S are 


(S1) x₁ = x₂ ⇒ (x₁ = x₃ ⇒ x₂ = x₃)

(S2) x₁ = x₂ ⇒ x₁′ = x₂′

(S3) 0 ≠ x₁′

(S4) x₁′ = x₂′ ⇒ x₁ = x₂

(S5) x₁ + 0 = x₁

(S6) x₁ + x₂′ = (x₁ + x₂)′

(S7) x₁ · 0 = 0

(S8) x₁ · (x₂)′ = (x₁ · x₂) + x₁

(S9) ℬ(0) ⇒ ((∀x)(ℬ(x) ⇒ ℬ(x′)) ⇒ (∀x)ℬ(x)) for any wf ℬ(x) of S.

We shall call (S9) the principle of mathematical induction. Notice that axioms
(S1)-(S8) are particular wfs, whereas (S9) is an axiom schema providing an
infinite number of axioms.*

Axioms (S3) and (S4) correspond to Peano postulates (P3) and (P4),
respectively. Peano's axioms (P1) and (P2) are taken care of by the presence of 0 as
an individual constant and f₁¹ as a function letter. Our axioms (S1) and (S2)
furnish some needed properties of equality; they would have been assumed
as intuitively obvious by Dedekind and Peano. Axioms (S5)-(S8) are the
recursion equations for addition and multiplication. They were not assumed
by Dedekind and Peano because the existence of operations + and ·
satisfying (S5)-(S8) is derivable by means of intuitive set theory, which was
presupposed as a background theory (see Mendelson, 1973, Chapter 2, Theorems
3.1 and 5.1).
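
Read as definitions by recursion on the second argument, (S5)-(S8) determine addition and multiplication from the successor operation alone. The following sketch illustrates this on Python's nonnegative integers; the function names `succ`, `add`, and `mul` are chosen here for illustration, and the comparison with the built-in operators is a finite sanity check, not a proof.

```python
def succ(n):
    # The successor operation x' (intuitively, x + 1).
    return n + 1

def add(x, y):
    # (S5): x + 0 = x        (S6): x + y' = (x + y)'
    if y == 0:
        return x
    return succ(add(x, y - 1))

def mul(x, y):
    # (S7): x · 0 = 0        (S8): x · y' = (x · y) + x
    if y == 0:
        return 0
    return add(mul(x, y - 1), x)

# Compare with the ordinary operations on a small range of arguments.
agrees = all(add(x, y) == x + y and mul(x, y) == x * y
             for x in range(8) for y in range(8))
print(agrees)
```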

Any theory that has the same theorems as S is often referred to in the lit- 
erature as Peano arithmetic, or simply PA. 

From (S9) by MP, we can obtain the induction rule:


ℬ(0), (∀x)(ℬ(x) ⇒ ℬ(x′)) ⊢_S (∀x)ℬ(x).


It will be our immediate aim to establish the usual rules of equality; that is, 
we shall show that the properties (A6) and (A7) of equality (see page 93) are
derivable in S and, hence, that S is a first-order theory with equality. 


* However, (S9) cannot fully correspond to Peano's postulate (P5), since the latter refers
intuitively to the 2^ℵ₀ properties of natural numbers, whereas (S9) can take care of only the
denumerable number of properties defined by wfs of ℒ_A.

First, for convenience and brevity in carrying out proofs, we cite some
immediate, trivial consequences of the axioms.


Lemma 3.1 


For any terms t, s, r of ℒ_A, the following wfs are theorems of S.


(S1′) t = r ⇒ (t = s ⇒ r = s)
(S2′) t = r ⇒ t′ = r′
(S3′) 0 ≠ t′
(S4′) t′ = r′ ⇒ t = r
(S5′) t + 0 = t
(S6′) t + r′ = (t + r)′
(S7′) t · 0 = 0
(S8′) t · r′ = (t · r) + t


Proof


(S1′)-(S8′) follow from (S1)-(S8), respectively. First form the closure by means
of Gen, use Exercise 2.48 to change all the bound variables to variables not
occurring in terms t, r, s, and then apply rule A4 with the appropriate terms
t, r, s.*


Proposition 3.2 


For any terms t, s, r, the following wfs are theorems of S.


a. t = t
b. t = r ⇒ r = t
c. t = r ⇒ (r = s ⇒ t = s)
d. r = t ⇒ (s = t ⇒ r = s)
e. t = r ⇒ t + s = r + s
f. t = 0 + t
g. t′ + r = (t + r)′
h. t + r = r + t
i. t = r ⇒ s + t = s + r
j. (t + r) + s = t + (r + s)
k. t = r ⇒ t · s = r · s
l. 0 · t = 0
m. t′ · r = t · r + r
n. t · r = r · t
o. t = r ⇒ s · t = s · r


* The change of bound variables is necessary in some cases. For example, if we want to obtain
x₂ = x₁ ⇒ x₂′ = x₁′ from x₁ = x₂ ⇒ x₁′ = x₂′, we first obtain (∀x₁)(∀x₂)(x₁ = x₂ ⇒ x₁′ = x₂′). We
cannot apply rule A4 to drop (∀x₁) and replace x₁ by x₂, since x₂ is not free for x₁ in
(∀x₂)(x₁ = x₂ ⇒ x₁′ = x₂′). From now on, we shall assume without explicit mention that the reader
is aware that we sometimes have to change bound variables when we use Gen and rule A4.


Proof


a. 1. t + 0 = t                                    (S5′)
   2. (t + 0 = t) ⇒ (t + 0 = t ⇒ t = t)            (S1′)
   3. t + 0 = t ⇒ t = t                            1, 2, MP
   4. t = t                                        1, 3, MP

b. 1. t = r ⇒ (t = t ⇒ r = t)                      (S1′)
   2. t = t ⇒ (t = r ⇒ r = t)                      1, tautology, MP
   3. t = r ⇒ r = t                                2, part (a), MP

c. 1. r = t ⇒ (r = s ⇒ t = s)                      (S1′)
   2. t = r ⇒ r = t                                Part (b)
   3. t = r ⇒ (r = s ⇒ t = s)                      1, 2, tautology, MP

d. 1. r = t ⇒ (t = s ⇒ r = s)                      Part (c)
   2. t = s ⇒ (r = t ⇒ r = s)                      1, tautology, MP
   3. s = t ⇒ t = s                                Part (b)
   4. s = t ⇒ (r = t ⇒ r = s)                      2, 3, tautology, MP

e. Apply the induction rule to ℬ(z): x = y ⇒ x + z = y + z.
   i.  1. x + 0 = x                                (S5′)
       2. y + 0 = y                                (S5′)
       3. x = y                                    Hyp
       4. x + 0 = y                                1, 3, part (c), MP
       5. x + 0 = y + 0                            4, 2, part (d), MP
       6. ⊢_S x = y ⇒ x + 0 = y + 0                1-5, deduction theorem
       Thus, ⊢_S ℬ(0).
   ii. 1. x = y ⇒ x + z = y + z                    Hyp
       2. x = y                                    Hyp
       3. x + z′ = (x + z)′                        (S6′)
       4. y + z′ = (y + z)′                        (S6′)
       5. x + z = y + z                            1, 2, MP
       6. (x + z)′ = (y + z)′                      5, (S2′), MP
       7. x + z′ = (y + z)′                        3, 6, part (c), MP
       8. x + z′ = y + z′                          4, 7, part (d), MP
       9. ⊢_S (x = y ⇒ x + z = y + z) ⇒
              (x = y ⇒ x + z′ = y + z′)            1-8, deduction theorem twice
   Thus, ⊢_S ℬ(z) ⇒ ℬ(z′), and, by Gen, ⊢_S (∀z)(ℬ(z) ⇒ ℬ(z′)). Hence,
   ⊢_S (∀z)ℬ(z) by the induction rule. Therefore, by Gen and rule A4,
   ⊢_S t = r ⇒ t + s = r + s.

f. Let ℬ(x) be x = 0 + x.
   i.  ⊢_S 0 = 0 + 0 by (S5′), part (b) and MP; thus, ⊢_S ℬ(0).
   ii. 1. x = 0 + x                                Hyp
       2. 0 + x′ = (0 + x)′                        (S6′)
       3. x′ = (0 + x)′                            1, (S2′), MP
       4. x′ = 0 + x′                              3, 2, part (d), MP
       5. ⊢_S x = 0 + x ⇒ x′ = 0 + x′              1-4, deduction theorem
   Thus, ⊢_S ℬ(x) ⇒ ℬ(x′) and, by Gen, ⊢_S (∀x)(ℬ(x) ⇒ ℬ(x′)). So, by (i),
   (ii) and the induction rule, ⊢_S (∀x)(x = 0 + x), and then, by rule A4,
   ⊢_S t = 0 + t.

g. Let ℬ(y) be x′ + y = (x + y)′.
   i.  1. x′ + 0 = x′                              (S5′)
       2. x + 0 = x                                (S5′)
       3. (x + 0)′ = x′                            2, (S2′), MP
       4. x′ + 0 = (x + 0)′                        1, 3, part (d), MP
       Thus, ⊢_S ℬ(0).
   ii. 1. x′ + y = (x + y)′                        Hyp
       2. x′ + y′ = (x′ + y)′                      (S6′)
       3. (x′ + y)′ = (x + y)″                     1, (S2′), MP
       4. x′ + y′ = (x + y)″                       2, 3, part (c), MP
       5. x + y′ = (x + y)′                        (S6′)
       6. (x + y′)′ = (x + y)″                     5, (S2′), MP
       7. x′ + y′ = (x + y′)′                      4, 6, part (d), MP
       8. ⊢_S x′ + y = (x + y)′ ⇒
              x′ + y′ = (x + y′)′                  1-7, deduction theorem
   Thus, ⊢_S ℬ(y) ⇒ ℬ(y′), and, by Gen, ⊢_S (∀y)(ℬ(y) ⇒ ℬ(y′)). Hence, by
   (i), (ii), and the induction rule, ⊢_S (∀y)(x′ + y = (x + y)′). By Gen and
   rule A4, ⊢_S t′ + r = (t + r)′.

h. Let ℬ(y) be x + y = y + x.
   i.  1. x + 0 = x                                (S5′)
       2. x = 0 + x                                Part (f)
       3. x + 0 = 0 + x                            1, 2, part (c), MP
       Thus, ⊢_S ℬ(0).
   ii. 1. x + y = y + x                            Hyp
       2. x + y′ = (x + y)′                        (S6′)
       3. y′ + x = (y + x)′                        Part (g)
       4. (x + y)′ = (y + x)′                      1, (S2′), MP
       5. x + y′ = (y + x)′                        2, 4, part (c), MP
       6. x + y′ = y′ + x                          5, 3, part (d), MP
       7. ⊢_S x + y = y + x ⇒
              x + y′ = y′ + x                      1-6, deduction theorem
   Thus, ⊢_S ℬ(y) ⇒ ℬ(y′) and, by Gen, ⊢_S (∀y)(ℬ(y) ⇒ ℬ(y′)). So, by (i),
   (ii) and the induction rule, ⊢_S (∀y)(x + y = y + x). Then, by rule A4,
   Gen and rule A4, ⊢_S t + r = r + t.

i. 1. t = r ⇒ t + s = r + s                        Part (e)
   2. t + s = s + t                                Part (h)
   3. r + s = s + r                                Part (h)
   4. t = r                                        Hyp
   5. t + s = r + s                                1, 4, MP
   6. s + t = r + s                                2, 5, (S1′), MP
   7. s + t = s + r                                6, 3, part (c), MP
   8. ⊢_S t = r ⇒ s + t = s + r                    1-7, deduction theorem

j. Let ℬ(z) be (x + y) + z = x + (y + z).
   i.  1. (x + y) + 0 = x + y                      (S5′)
       2. y + 0 = y                                (S5′)
       3. x + (y + 0) = x + y                      2, part (i), MP
       4. (x + y) + 0 = x + (y + 0)                1, 3, part (d), MP
       Thus, ⊢_S ℬ(0).
   ii. 1. (x + y) + z = x + (y + z)                Hyp
       2. (x + y) + z′ = ((x + y) + z)′            (S6′)
       3. ((x + y) + z)′ = (x + (y + z))′          1, (S2′), MP
       4. (x + y) + z′ = (x + (y + z))′            2, 3, part (c), MP
       5. y + z′ = (y + z)′                        (S6′)
       6. x + (y + z′) = x + (y + z)′              5, part (i), MP
       7. x + (y + z)′ = (x + (y + z))′            (S6′)
       8. x + (y + z′) = (x + (y + z))′            6, 7, part (c), MP
       9. (x + y) + z′ = x + (y + z′)              4, 8, part (d), MP
      10. ⊢_S (x + y) + z = x + (y + z) ⇒
              (x + y) + z′ = x + (y + z′)          1-9, deduction theorem
   Thus, ⊢_S ℬ(z) ⇒ ℬ(z′) and, by Gen, ⊢_S (∀z)(ℬ(z) ⇒ ℬ(z′)). So, by (i), (ii) and
   the induction rule, ⊢_S (∀z)ℬ(z), and then, by Gen and rule A4,
   ⊢_S (t + r) + s = t + (r + s).


Parts (k)-(o) are left as exercises.


Corollary 3.3 


S is a theory with equality. 


Proof 


By Proposition 2.25, this reduces to parts (a)-(e), (i), (k) and (o) of Proposition
3.2, and (S2′).


Notice that the interpretation in which 


a. The set of nonnegative integers is the domain
b. The integer 0 is the interpretation of the symbol 0
c. The successor operation (addition of 1) is the interpretation of the ′
function (that is, of f₁¹)

d. Ordinary addition and multiplication are the interpretations of +
and ·

e. The interpretation of the predicate letter = is the identity relation


is a normal model for S. This model is called the standard interpretation or 
standard model. Any normal model for S that is not isomorphic to the stan- 
dard model will be called a nonstandard model for S. 

If we recognize the standard interpretation to be a model for S, then, of 
course, S is consistent. However, this kind of semantic argument, involving 
as it does a certain amount of set-theoretic reasoning, is regarded by some 
as too precarious to serve as a basis for consistency proofs. Moreover, we 
have not proved in a rigorous way that the axioms of S are true under the 
standard interpretation, but we have taken it as intuitively obvious. For these 
and other reasons, when the consistency of S enters into the argument of a
proof, it is common practice to take the statement of the consistency of S as 
an explicit unproved assumption. 

Some important additional properties of addition and multiplication are 
covered by the following result. 


Proposition 3.4 
For any terms t, r, s, the following wfs are theorems of S.

a. $t \cdot (r + s) = (t \cdot r) + (t \cdot s)$    (distributivity)
b. $(r + s) \cdot t = (r \cdot t) + (s \cdot t)$    (distributivity)
c. $(t \cdot r) \cdot s = t \cdot (r \cdot s)$    (associativity of $\cdot$)
d. $t + s = r + s \Rightarrow t = r$    (cancellation law for $+$)


Proof

a. Prove $\vdash_S x \cdot (y + z) = (x \cdot y) + (x \cdot z)$ by induction on z.
b. Use part (a) and Proposition 3.2(n).
c. Prove $\vdash_S (x \cdot y) \cdot z = x \cdot (y \cdot z)$ by induction on z.
d. Prove $\vdash_S x + z = y + z \Rightarrow x = y$ by induction on z. This requires, for the
first time, use of (S4').


The terms $0, 0', 0'', 0''', \ldots$ we shall call numerals and denote by $\overline{0}, \overline{1}, \overline{2}, \overline{3}, \ldots$
More precisely, $\overline{0}$ is $0$ and, for any natural number n, $\overline{n+1}$ is $(\overline{n})'$. In general,
if n is a natural number, $\overline{n}$ stands for the numeral consisting of 0 followed
by n strokes. The numerals can be defined recursively by stating that 0 is a
numeral and, if u is a numeral, then $u'$ is also a numeral.
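
As an informal illustration outside S, the recursive definition of numerals can be sketched in Python. The names `numeral`, `is_numeral` and `value` are our own and do not occur in the text; terms are modeled simply as strings made of the symbol 0 and strokes.

```python
def numeral(n: int) -> str:
    """Return the numeral for n: the symbol 0 followed by n strokes."""
    if n < 0:
        raise ValueError("numerals exist only for natural numbers")
    return "0" + "'" * n


def is_numeral(term: str) -> bool:
    """Recursive definition: 0 is a numeral; if u is a numeral, so is u'."""
    if term == "0":
        return True
    return term.endswith("'") and is_numeral(term[:-1])


def value(term: str) -> int:
    """Return the natural number denoted by a numeral (its stroke count)."""
    if not is_numeral(term):
        raise ValueError("not a numeral: " + repr(term))
    return len(term) - 1
```

Under this modeling, the numeral for n + 1 is obtained from the numeral for n by appending one stroke, which mirrors the clause that $\overline{n+1}$ is $(\overline{n})'$.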
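Proposition 3.5


The following are theorems of S.

a. $t + \overline{1} = t'$
b. $t \cdot \overline{1} = t$
c. $t \cdot \overline{2} = t + t$
d. $t + s = 0 \Rightarrow t = 0 \wedge s = 0$
e. $t \neq 0 \Rightarrow (s \cdot t = 0 \Rightarrow s = 0)$
f. $t + s = \overline{1} \Rightarrow (t = 0 \wedge s = \overline{1}) \vee (t = \overline{1} \wedge s = 0)$
g. $t \cdot s = \overline{1} \Rightarrow (t = \overline{1} \wedge s = \overline{1})$
h. $t \neq 0 \Rightarrow (\exists y)(t = y')$
i. $s \neq 0 \Rightarrow (t \cdot s = r \cdot s \Rightarrow t = r)$
j. $t \neq 0 \Rightarrow (t \neq \overline{1} \Rightarrow (\exists y)(t = y''))$


Proof

a. 1. $t + 0' = (t + 0)'$    (S6')
   2. $t + 0 = t$    (S5')
   3. $(t + 0)' = t'$    2, (S2'), MP
   4. $t + 0' = t'$    1, 3, Proposition 3.2(c), MP
   5. $t + \overline{1} = t'$    4, abbreviation

b. 1. $t \cdot 0' = t \cdot 0 + t$    (S8')
   2. $t \cdot 0 = 0$    (S7')
   3. $(t \cdot 0) + t = 0 + t$    2, Proposition 3.2(e), MP
   4. $t \cdot 0' = 0 + t$    1, 3, Proposition 3.2(c), MP
   5. $0 + t = t$    Proposition 3.2(f, b), MP
   6. $t \cdot 0' = t$    4, 5, Proposition 3.2(c), MP
   7. $t \cdot \overline{1} = t$    6, abbreviation

c. 1. $t \cdot (\overline{1})' = (t \cdot \overline{1}) + t$    (S8')
   2. $t \cdot \overline{1} = t$    Part (b)
   3. $(t \cdot \overline{1}) + t = t + t$    2, Proposition 3.2(e), MP
   4. $t \cdot (\overline{1})' = t + t$    1, 3, Proposition 3.2(c), MP
   5. $t \cdot \overline{2} = t + t$    4, abbreviation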


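d. Let $\mathscr{B}(y)$ be $x + y = 0 \Rightarrow x = 0 \wedge y = 0$. It is easy to prove that $\vdash_S \mathscr{B}(0)$.
Also, since $\vdash_S (x + y)' \neq 0$ by (S3') and Proposition 3.2(b), it follows
by (S6') that $\vdash_S x + y' \neq 0$. Hence, $\vdash_S \mathscr{B}(y')$ by the tautology $\neg A \Rightarrow
(A \Rightarrow B)$. So, $\vdash_S \mathscr{B}(y) \Rightarrow \mathscr{B}(y')$ by the tautology $A \Rightarrow (B \Rightarrow A)$. Then, by
the induction rule, $\vdash_S (\forall y)\mathscr{B}(y)$ and then, by rule A4, Gen and rule
A4, we obtain the theorem.

e. The proof is similar to that for part (d) and is left as an exercise.

f. Use induction on y in the wf $x + y = \overline{1} \Rightarrow ((x = 0 \wedge y = \overline{1}) \vee (x = \overline{1} \wedge y = 0))$.

g. Use induction on y in $x \cdot y = \overline{1} \Rightarrow (x = \overline{1} \wedge y = \overline{1})$.

h. Perform induction on x in $x \neq 0 \Rightarrow (\exists w)(x = w')$.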
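i. Let $\mathscr{B}(y)$ be $(\forall x)(z \neq 0 \Rightarrow (x \cdot z = y \cdot z \Rightarrow x = y))$.

   i. 1. $z \neq 0$    Hyp
      2. $x \cdot z = 0 \cdot z$    Hyp
      3. $0 \cdot z = 0$    Proposition 3.2(l)
      4. $x \cdot z = 0$    2, 3, Proposition 3.2(c), MP
      5. $x = 0$    1, 4, part (e), MP
      6. $\vdash_S z \neq 0 \Rightarrow (x \cdot z = 0 \cdot z \Rightarrow x = 0)$    1-5, deduction theorem
      7. $\vdash_S (\forall x)(z \neq 0 \Rightarrow (x \cdot z = 0 \cdot z \Rightarrow x = 0))$    6, Gen

   Thus, $\vdash_S \mathscr{B}(0)$.

   ii. 1. $(\forall x)(z \neq 0 \Rightarrow (x \cdot z = y \cdot z \Rightarrow x = y))$    Hyp ($\mathscr{B}(y)$)
       2. $z \neq 0$    Hyp
       3. $x \cdot z = y' \cdot z$    Hyp
       4. $y' \neq 0$    (S3'), Proposition 3.2(b), MP
       5. $y' \cdot z \neq 0$    2, 4, part (e), a tautology, MP
       6. $x \cdot z \neq 0$    3, 5, (S1'), tautologies, MP
       7. $x \neq 0$    6, (S7'), Proposition 3.2(o, n), (S1'), tautologies, MP
       8. $(\exists w)(x = w')$    7, part (h), MP
       9. $x = b'$    8, rule C
       10. $b' \cdot z = y' \cdot z$    3, 9, (A7), MP
       11. $b \cdot z + z = y \cdot z + z$    10, Proposition 3.2(m, d), MP
       12. $b \cdot z = y \cdot z$    11, Proposition 3.4(d), MP
       13. $z \neq 0 \Rightarrow (b \cdot z = y \cdot z \Rightarrow b = y)$    1, rule A4
       14. $b \cdot z = y \cdot z \Rightarrow b = y$    2, 13, MP
       15. $b = y$    12, 14, MP
       16. $b' = y'$    15, (S2'), MP
       17. $x = y'$    9, 16, Proposition 3.2(c), MP
       18. $\mathscr{B}(y),\ z \neq 0,\ x \cdot z = y' \cdot z \vdash_S x = y'$    1-17, Proposition 2.10
       19. $\mathscr{B}(y) \vdash_S z \neq 0 \Rightarrow (x \cdot z = y' \cdot z \Rightarrow x = y')$    18, deduction theorem twice
       20. $\mathscr{B}(y) \vdash_S (\forall x)(z \neq 0 \Rightarrow (x \cdot z = y' \cdot z \Rightarrow x = y'))$    19, Gen
       21. $\vdash_S \mathscr{B}(y) \Rightarrow \mathscr{B}(y')$    20, deduction theorem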


Hence, by (i), (ii), Gen, and the induction rule, we obtain $\vdash_S (\forall y)\mathscr{B}(y)$
and then, by Gen and rule A4, we have the desired result.

j. This is left as an exercise.


Proposition 3.6 


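a. Let m and n be any natural numbers.
   i. If $m \neq n$, then $\vdash_S \overline{m} \neq \overline{n}$.
   ii. $\vdash_S \overline{m + n} = \overline{m} + \overline{n}$ and $\vdash_S \overline{m \cdot n} = \overline{m} \cdot \overline{n}$.
b. Any model for S is infinite.
c. For any cardinal number $\aleph_\beta$, S has a normal model of cardinality $\aleph_\beta$.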


Proof 


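a. i. Assume $m \neq n$. Either $m < n$ or $n < m$. Say, $m < n$.

      1. $\overline{m} = \overline{n}$    Hyp
      2. $0''\cdots'$ (m strokes) $= 0''\cdots'$ (n strokes)    1 is an abbreviation of 2
      3. Apply (S4') and MP m times in a row. We get $0 = 0''\cdots'$ (with $n - m$ strokes). Let t be
         $\overline{n - m - 1}$. Since $n > m$, $n - m - 1 \geq 0$. Thus, we obtain $0 = t'$.
      4. $0 \neq t'$    (S3')
      5. $0 = t' \wedge 0 \neq t'$    3, 4, conjunction introduction
      6. $\overline{m} = \overline{n} \vdash_S 0 = t' \wedge 0 \neq t'$    1-5
      7. $\vdash_S \overline{m} \neq \overline{n}$    1-6, proof by contradiction

      A similar proof holds in the case when $n < m$. (A more rigorous
      proof can be given by induction in the metalanguage with
      respect to n.)

   ii. We use induction in the metalanguage. First, $\overline{m + 0}$ is $\overline{m}$.
      Hence, $\vdash_S \overline{m + 0} = \overline{m} + 0$ by (S5'). Now assume $\vdash_S \overline{m + n} = \overline{m} + \overline{n}$.
      Then $\vdash_S (\overline{m + n})' = \overline{m} + (\overline{n})'$ by (S2') and (S6'). But $\overline{m + (n + 1)}$
      is $(\overline{m + n})'$ and $\overline{n + 1}$ is $(\overline{n})'$. Hence, $\vdash_S \overline{m + (n + 1)} = \overline{m} + \overline{n + 1}$.
      Thus, $\vdash_S \overline{m + n} = \overline{m} + \overline{n}$. The proof that $\vdash_S \overline{m \cdot n} = \overline{m} \cdot \overline{n}$ is left as an
      exercise.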


b. By part (a), (i), in a model for S the objects corresponding to the 
numerals must be distinct. But there are denumerably many 
numerals. 

c. This follows from Corollary 2.34(c) and the fact that the standard 
model is an infinite normal model. 


An order relation can be introduced by definition in S. 


Definitions 


In the first definition, as usual, we choose w to be the first variable not in t or s. 


$t < s$ for $(\exists w)(w \neq 0 \wedge w + t = s)$
$t \leq s$ for $t < s \vee t = s$
$t > s$ for $s < t$
$t \geq s$ for $s \leq t$
$t \nless s$ for $\neg(t < s)$, and so on
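
The first two definitions can be checked informally against the usual order on the natural numbers. The following Python sketch is only an illustration in the standard model, not a derivation in S; the function names are our own, and the search for the witness w is bounded by s because any w with w + t = s satisfies w ≤ s.

```python
def less_than(t: int, s: int) -> bool:
    """t < s iff some w != 0 satisfies w + t == s (w is searched up to s)."""
    return any(w != 0 and w + t == s for w in range(s + 1))


def less_or_equal(t: int, s: int) -> bool:
    """t <= s iff t < s or t = s, following the second definition."""
    return less_than(t, s) or t == s


if __name__ == "__main__":
    # Compare the defined relations with Python's built-in order on a small range.
    for t in range(6):
        for s in range(6):
            assert less_than(t, s) == (t < s)
            assert less_or_equal(t, s) == (t <= s)
    print("definitions agree with the usual order on 0..5")
```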
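Proposition 3.7

For any terms t, r, s, the following are theorems.

a. $t \nless t$
b. $t < s \Rightarrow (s < r \Rightarrow t < r)$
c. $t < s \Rightarrow s \nless t$
d. $t < s \Leftrightarrow t + r < s + r$
e. $t \leq t$
f. $t \leq s \Rightarrow (s \leq r \Rightarrow t \leq r)$
g. $t \leq s \Leftrightarrow t + r \leq s + r$
h. $t \leq s \Rightarrow (s < r \Rightarrow t < r)$
i. $0 \leq t$
j. $0 < t'$
k. $t < r \Leftrightarrow t' \leq r$
l. $t \leq r \Leftrightarrow t < r'$
m. $t < t'$
n. $0 < \overline{1},\ \overline{1} < \overline{2},\ \overline{2} < \overline{3},\ \ldots$
o. $t \neq r \Rightarrow (t < r \vee r < t)$
p. $t = r \vee t < r \vee r < t$
q. $t \leq r \vee r \leq t$
r. $t + r \geq t$
s. $r \neq 0 \Rightarrow t + r > t$
t. $r \neq 0 \Rightarrow t \cdot r \geq t$
u. $r \neq 0 \Leftrightarrow r > 0$
v. $r > 0 \Rightarrow (t > 0 \Rightarrow r \cdot t > 0)$
w. $r \neq 0 \Rightarrow (t > \overline{1} \Rightarrow t \cdot r > r)$
x. $r \neq 0 \Rightarrow (t < s \Leftrightarrow t \cdot r < s \cdot r)$
y. $r \neq 0 \Rightarrow (t \leq s \Leftrightarrow t \cdot r \leq s \cdot r)$
z. $t \nless 0$
z'. $t \leq r \wedge r \leq t \Rightarrow t = r$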


Proof 

a. 1. $t < t$    Hyp
   2. $(\exists w)(w \neq 0 \wedge w + t = t)$    1 is an abbreviation of 2
   3. $b \neq 0 \wedge b + t = t$    2, rule C
   4. $b + t = t$    3, conjunction rule
   5. $t = 0 + t$    Proposition 3.2(f)
   6. $b + t = 0 + t$    4, 5, Proposition 3.2(c), MP
   7. $b = 0$    6, Proposition 3.4(d), MP
   8. $b \neq 0$    3, conjunction elimination
   9. $b = 0 \wedge b \neq 0$    7, 8, conjunction introduction
   10. $0 = 0 \wedge 0 \neq 0$    9, tautology: $B \wedge \neg B \Rightarrow C$, MP
   11. $t < t \vdash_S 0 = 0 \wedge 0 \neq 0$    1-10, Proposition 2.10
   12. $\vdash_S t \nless t$    1-11, proof by contradiction

b. 1. $t < s$    Hyp
   2. $s < r$    Hyp
   3. $(\exists w)(w \neq 0 \wedge w + t = s)$    1 is an abbreviation of 3
   4. $(\exists v)(v \neq 0 \wedge v + s = r)$    2 is an abbreviation of 4
   5. $b \neq 0 \wedge b + t = s$    3, rule C
   6. $c \neq 0 \wedge c + s = r$    4, rule C
   7. $b + t = s$    5, conjunction elimination
   8. $c + s = r$    6, conjunction elimination
   9. $c + (b + t) = c + s$    7, Proposition 3.2(i), MP
   10. $c + (b + t) = r$    9, 8, Proposition 3.2(c), MP
   11. $(c + b) + t = r$    10, Proposition 3.2(j, c), MP
   12. $b \neq 0$    5, conjunction elimination
   13. $c + b \neq 0$    12, Proposition 3.5(d), tautology, MP
   14. $c + b \neq 0 \wedge (c + b) + t = r$    13, 11, conjunction introduction
   15. $(\exists u)(u \neq 0 \wedge u + t = r)$    14, rule E4
   16. $t < r$    Abbreviation of 15
   17. $\vdash_S t < s \Rightarrow (s < r \Rightarrow t < r)$    1-15, Proposition 2.10, deduction theorem


Parts (c)-(z’) are left as exercises. 


Proposition 3.8 


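a. For any natural number k, $\vdash_S x = \overline{0} \vee \ldots \vee x = \overline{k} \Leftrightarrow x \leq \overline{k}$.

a'. For any natural number k and any wf $\mathscr{B}$, $\vdash_S \mathscr{B}(\overline{0}) \wedge \mathscr{B}(\overline{1}) \wedge \ldots \wedge \mathscr{B}(\overline{k}) \Leftrightarrow
(\forall x)(x \leq \overline{k} \Rightarrow \mathscr{B}(x))$.

b. For any natural number $k > 0$, $\vdash_S x = \overline{0} \vee \ldots \vee x = \overline{k - 1} \Leftrightarrow x < \overline{k}$.

b'. For any natural number $k > 0$ and any wf $\mathscr{B}$, $\vdash_S \mathscr{B}(\overline{0}) \wedge
\mathscr{B}(\overline{1}) \wedge \ldots \wedge \mathscr{B}(\overline{k - 1}) \Leftrightarrow (\forall x)(x < \overline{k} \Rightarrow \mathscr{B}(x))$.

c. $\vdash_S ((\forall x)(x < y \Rightarrow \mathscr{B}(x)) \wedge (\forall x)(x \geq y \Rightarrow \mathscr{C}(x))) \Rightarrow (\forall x)(\mathscr{B}(x) \vee \mathscr{C}(x))$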

Proof 
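a. We prove $\vdash_S x = \overline{0} \vee \ldots \vee x = \overline{k} \Leftrightarrow x \leq \overline{k}$ by induction in the metalanguage
on k. The case for $k = 0$, $\vdash_S x = 0 \Leftrightarrow x \leq 0$, is obvious from the
definitions and Proposition 3.7. Assume as inductive hypothesis
$\vdash_S x = \overline{0} \vee \ldots \vee x = \overline{k} \Leftrightarrow x \leq \overline{k}$. Now assume $x = \overline{0} \vee \ldots \vee x = \overline{k} \vee x = \overline{k + 1}$.
But $\vdash_S x = \overline{k + 1} \Rightarrow x \leq \overline{k + 1}$ and, by the inductive hypothesis,
$\vdash_S x = \overline{0} \vee \ldots \vee x = \overline{k} \Rightarrow x \leq \overline{k}$. Also $\vdash_S x \leq \overline{k} \Rightarrow x \leq \overline{k + 1}$. Thus, $x \leq \overline{k + 1}$. So,
$\vdash_S x = \overline{0} \vee \ldots \vee x = \overline{k} \vee x = \overline{k + 1} \Rightarrow x \leq \overline{k + 1}$. Conversely, assume $x \leq \overline{k + 1}$.
Then $x = \overline{k + 1} \vee x < \overline{k + 1}$. If $x = \overline{k + 1}$, then $x = \overline{0} \vee \ldots \vee x = \overline{k} \vee x = \overline{k + 1}$.
If $x < \overline{k + 1}$, then since $\overline{k + 1}$ is $(\overline{k})'$, we have $x \leq \overline{k}$ by Proposition
3.7(l). By the inductive hypothesis, $x = \overline{0} \vee \ldots \vee x = \overline{k}$, and, therefore,
$x = \overline{0} \vee \ldots \vee x = \overline{k} \vee x = \overline{k + 1}$. In either case, $x = \overline{0} \vee \ldots \vee x = \overline{k} \vee x = \overline{k + 1}$.
This proves $\vdash_S x \leq \overline{k + 1} \Rightarrow x = \overline{0} \vee \ldots \vee x = \overline{k} \vee x = \overline{k + 1}$. From the inductive
hypothesis, we have derived $\vdash_S x = \overline{0} \vee \ldots \vee x = \overline{k + 1} \Leftrightarrow x \leq \overline{k + 1}$
and this completes the proof. (This proof has been given in an informal
manner that we shall generally use from now on. In particular,
the deduction theorem, the eliminability of rule C, the replacement
theorem, and various derived rules and tautologies will be applied
without being explicitly mentioned.)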


Parts (a’), (b), and (b’) follow easily from part (a). Part (c) follows almost imme- 
diately from Proposition 3.7(o), using obvious tautologies. 

There are several other forms of the induction principle that we can prove 
at this point. 


Proposition 3.9 


a. Complete induction. $\vdash_S (\forall x)((\forall z)(z < x \Rightarrow \mathscr{B}(z)) \Rightarrow \mathscr{B}(x)) \Rightarrow (\forall x)\mathscr{B}(x)$. In ordinary
language, consider a property P such that, for any x, if P holds for
all natural numbers less than x, then P holds for x also. Then P holds
for all natural numbers.

b. Least-number principle. $\vdash_S (\exists x)\mathscr{B}(x) \Rightarrow (\exists y)[\mathscr{B}(y) \wedge (\forall z)(z < y \Rightarrow \neg\mathscr{B}(z))]$. If a
property P holds for some natural number, then there is a least number
satisfying P.


Proof 


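a. Let $\mathscr{C}(x)$ be $(\forall z)(z \leq x \Rightarrow \mathscr{B}(z))$.

   i. 1. $(\forall x)((\forall z)(z < x \Rightarrow \mathscr{B}(z)) \Rightarrow \mathscr{B}(x))$    Hyp
      2. $(\forall z)(z < 0 \Rightarrow \mathscr{B}(z)) \Rightarrow \mathscr{B}(0)$    1, rule A4
      3. $z \nless 0$    Proposition 3.7(y)
      4. $(\forall z)(z < 0 \Rightarrow \mathscr{B}(z))$    3, tautology, Gen
      5. $\mathscr{B}(0)$    2, 4, MP
      6. $(\forall z)(z \leq 0 \Rightarrow \mathscr{B}(z))$, i.e., $\mathscr{C}(0)$    5, Proposition 3.8(a')
      7. $(\forall x)((\forall z)(z < x \Rightarrow \mathscr{B}(z)) \Rightarrow \mathscr{B}(x)) \vdash_S \mathscr{C}(0)$    1-6

   ii. 1. $(\forall x)((\forall z)(z < x \Rightarrow \mathscr{B}(z)) \Rightarrow \mathscr{B}(x))$    Hyp
       2. $\mathscr{C}(x)$, i.e., $(\forall z)(z \leq x \Rightarrow \mathscr{B}(z))$    Hyp
       3. $(\forall z)(z < x' \Rightarrow \mathscr{B}(z))$    2, Proposition 3.7(l)
       4. $(\forall z)(z < x' \Rightarrow \mathscr{B}(z)) \Rightarrow \mathscr{B}(x')$    1, rule A4
       5. $\mathscr{B}(x')$    3, 4, MP
       6. $z \leq x' \Rightarrow z < x' \vee z = x'$    Definition, tautology
       7. $z < x' \Rightarrow \mathscr{B}(z)$    3, rule A4
       8. $z = x' \Rightarrow \mathscr{B}(z)$    5, axiom (A7), Proposition 2.23(b), tautologies
       9. $(\forall z)(z \leq x' \Rightarrow \mathscr{B}(z))$, i.e., $\mathscr{C}(x')$    6, 7, 8, tautology, Gen
       10. $(\forall x)((\forall z)(z < x \Rightarrow \mathscr{B}(z)) \Rightarrow \mathscr{B}(x)) \vdash_S (\forall x)(\mathscr{C}(x) \Rightarrow \mathscr{C}(x'))$    1-9, deduction theorem, Gen

By (i), (ii), and the induction rule, we obtain $\mathscr{D} \vdash_S (\forall x)\mathscr{C}(x)$, that is, $\mathscr{D} \vdash_S (\forall x)(\forall z)(z \leq x \Rightarrow \mathscr{B}(z))$,
where $\mathscr{D}$ is $(\forall x)((\forall z)(z < x \Rightarrow \mathscr{B}(z)) \Rightarrow \mathscr{B}(x))$. Hence, by rule
A4 twice, $\mathscr{D} \vdash_S x \leq x \Rightarrow \mathscr{B}(x)$. But $\vdash_S x \leq x$. So, $\mathscr{D} \vdash_S \mathscr{B}(x)$, and, by Gen and the
deduction theorem, $\vdash_S \mathscr{D} \Rightarrow (\forall x)\mathscr{B}(x)$.


b. 1. $\neg(\exists y)(\mathscr{B}(y) \wedge (\forall z)(z < y \Rightarrow \neg\mathscr{B}(z)))$    Hyp
   2. $(\forall y)\neg(\mathscr{B}(y) \wedge (\forall z)(z < y \Rightarrow \neg\mathscr{B}(z)))$    1, derived rule for negation
   3. $(\forall y)((\forall z)(z < y \Rightarrow \neg\mathscr{B}(z)) \Rightarrow \neg\mathscr{B}(y))$    2, tautology, replacement
   4. $(\forall y)\neg\mathscr{B}(y)$    3, part (a) with $\neg\mathscr{B}$ instead of $\mathscr{B}$
   5. $\neg(\exists y)\mathscr{B}(y)$    4, derived rule for negation
   6. $\neg(\exists x)\mathscr{B}(x)$    5, change of bound variable
   7. $\vdash_S \neg(\exists y)(\mathscr{B}(y) \wedge (\forall z)(z < y \Rightarrow \neg\mathscr{B}(z))) \Rightarrow \neg(\exists x)\mathscr{B}(x)$    1-6, deduction theorem
   8. $\vdash_S (\exists x)\mathscr{B}(x) \Rightarrow (\exists y)(\mathscr{B}(y) \wedge (\forall z)(z < y \Rightarrow \neg\mathscr{B}(z)))$    7, derived rule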


Exercise 


3.1 (Method of infinite descent) 


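Prove $\vdash_S (\forall x)(\mathscr{B}(x) \Rightarrow (\exists y)(y < x \wedge \mathscr{B}(y))) \Rightarrow (\forall x)\neg\mathscr{B}(x)$.

Another important notion in number theory is divisibility, which we now
define.


Definition
$t \mid s$ for $(\exists z)(s = t \cdot z)$. (Here, z is the first variable not in t or s.)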
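Proposition 3.10


The following wfs are theorems for any terms t, s, r.

a. $t \mid t$
b. $\overline{1} \mid t$
c. $t \mid 0$
d. $t \mid s \wedge s \mid r \Rightarrow t \mid r$
e. $s \neq 0 \wedge t \mid s \Rightarrow t \leq s$
f. $t \mid s \wedge s \mid t \Rightarrow s = t$
g. $t \mid s \Rightarrow t \mid (r \cdot s)$
h. $t \mid s \wedge t \mid r \Rightarrow t \mid (s + r)$

Proof
a. $t = t \cdot \overline{1}$. Hence, $t \mid t$.
b. $t = \overline{1} \cdot t$. Hence, $\overline{1} \mid t$.
c. $0 = t \cdot 0$. Hence, $t \mid 0$.
d. If $s = t \cdot z$ and $r = s \cdot w$, then $r = t \cdot (z \cdot w)$.
e. If $s \neq 0$ and $t \mid s$, then $s = t \cdot z$ for some z. If $z = 0$, then $s = 0$. Hence, $z \neq 0$.
So, $z = u'$ for some u. Then $s = t \cdot (u') = t \cdot u + t \geq t$.
f-h. These proofs are left as exercises.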
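
The definition of divisibility and several parts of Proposition 3.10 can be spot-checked in the standard model. The Python sketch below is an informal illustration only (it proves nothing in S); the name `divides` is our own, and the witness z is searched up to s, which suffices because s = t · z with s ≠ 0 forces z ≤ s.

```python
def divides(t: int, s: int) -> bool:
    """t | s iff s == t * z for some natural number z (searched for z <= s)."""
    return any(s == t * z for z in range(s + 1))


if __name__ == "__main__":
    bound = 12
    for t in range(bound):
        assert divides(t, t)   # part (a)
        assert divides(1, t)   # part (b)
        assert divides(t, 0)   # part (c)
        for s in range(bound):
            if s != 0 and divides(t, s):
                assert t <= s  # part (e)
            if divides(t, s) and divides(s, t):
                assert s == t  # part (f)
    print("parts (a)-(c), (e), (f) hold for all terms below", bound)
```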


Exercises 


3.2 Prove $\vdash_S t \mid \overline{1} \Rightarrow t = \overline{1}$.
3.3 Prove $\vdash_S (t \mid s \wedge t \mid s') \Rightarrow t = \overline{1}$.


It will be useful for later purposes to prove the existence of a unique quotient
and remainder upon division of one number x by another nonzero
number y.


Proposition 3.11 


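$\vdash_S y \neq 0 \Rightarrow (\exists u)(\exists v)[x = y \cdot u + v \wedge v < y
\wedge (\forall u_1)(\forall v_1)((x = y \cdot u_1 + v_1 \wedge v_1 < y) \Rightarrow u = u_1 \wedge v = v_1)]$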


Proof 


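Let $\mathscr{B}(x)$ be $y \neq 0 \Rightarrow (\exists u)(\exists v)(x = y \cdot u + v \wedge v < y)$.

i. 1. $y \neq 0$    Hyp
   2. $0 = y \cdot 0 + 0$    (S5'), (S7')
   3. $0 < y$    1, Proposition 3.7(t)
   4. $0 = y \cdot 0 + 0 \wedge 0 < y$    2, 3, conjunction rule
   5. $(\exists u)(\exists v)(0 = y \cdot u + v \wedge v < y)$    4, rule E4 twice
   6. $y \neq 0 \Rightarrow (\exists u)(\exists v)(0 = y \cdot u + v \wedge v < y)$    1-5, deduction theorem

ii. 1. $\mathscr{B}(x)$, i.e., $y \neq 0 \Rightarrow (\exists u)(\exists v)(x = y \cdot u + v \wedge v < y)$    Hyp
    2. $y \neq 0$    Hyp
    3. $(\exists u)(\exists v)(x = y \cdot u + v \wedge v < y)$    1, 2, MP
    4. $x = y \cdot a + b \wedge b < y$    3, rule C twice
    5. $b < y$    4, conjunction elimination
    6. $b' \leq y$    5, Proposition 3.7(k)
    7. $b' < y \vee b' = y$    6, definition
    8. $b' < y \Rightarrow (x' = y \cdot a + b' \wedge b' < y)$    4, (S6'), derived rules
    9. $b' < y \Rightarrow (\exists u)(\exists v)(x' = y \cdot u + v \wedge v < y)$    8, rule E4, deduction theorem
    10. $b' = y \Rightarrow x' = y \cdot a + y \cdot \overline{1}$    4, (S6'), Proposition 3.5(b)
    11. $b' = y \Rightarrow (x' = y \cdot (a + \overline{1}) + 0 \wedge 0 < y)$    10, Proposition 3.4, 2, Proposition 3.7(t), (S5')
    12. $b' = y \Rightarrow (\exists u)(\exists v)(x' = y \cdot u + v \wedge v < y)$    11, rule E4 twice, deduction theorem
    13. $(\exists u)(\exists v)(x' = y \cdot u + v \wedge v < y)$    7, 9, 12, disjunction elimination
    14. $\mathscr{B}(x) \Rightarrow (y \neq 0 \Rightarrow (\exists u)(\exists v)(x' = y \cdot u + v \wedge v < y))$, i.e., $\mathscr{B}(x) \Rightarrow \mathscr{B}(x')$    1-13, deduction theorem


By (i), (ii), Gen and the induction rule, $\vdash_S (\forall x)\mathscr{B}(x)$. This establishes the existence
of a quotient u and a remainder v. To prove uniqueness, proceed as
follows. Assume $y \neq 0$. Assume $x = y \cdot u + v \wedge v < y$ and $x = y \cdot u_1 + v_1 \wedge v_1 < y$.
Now, $u = u_1$ or $u < u_1$ or $u_1 < u$. If $u = u_1$, then $v = v_1$ by Proposition 3.4(d). If
$u < u_1$, then $u_1 = u + w$ for some $w \neq 0$. Then $y \cdot u + v = y \cdot (u + w) + v_1 = y \cdot u +
y \cdot w + v_1$. Hence, $v = y \cdot w + v_1$. Since $w \neq 0$, $y \cdot w \geq y$. So, $v = y \cdot w + v_1 \geq y$,
contradicting $v < y$. Hence, $u \nless u_1$. Similarly, $u_1 \nless u$. Thus, $u = u_1$. Since $y \cdot u +
v = x = y \cdot u_1 + v_1$, it follows that $v = v_1$.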
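
The existence argument is constructive in spirit: step (ii) tells us how to pass from a quotient and remainder for x to a quotient and remainder for x'. The Python sketch below follows that successor step and also checks uniqueness by exhaustive search; it is an informal illustration in the standard model, and the names `step`, `divide` and `quotient_remainder_pairs` are our own.

```python
def step(y: int, a: int, b: int):
    """From x = y*a + b with b < y, return the pair (u, v) for x + 1.

    Mirrors the case split in part (ii): either b' < y or b' = y.
    """
    if b + 1 < y:
        return a, b + 1
    return a + 1, 0


def divide(x: int, y: int):
    """Quotient and remainder of x by a nonzero y, built up by induction on x."""
    if y == 0:
        raise ValueError("y must be nonzero")
    a, b = 0, 0  # base case: 0 = y*0 + 0 and 0 < y
    for _ in range(x):
        a, b = step(y, a, b)
    return a, b


def quotient_remainder_pairs(x: int, y: int):
    """All (u, v) with x == y*u + v and v < y, found by exhaustive search."""
    return [(u, v) for u in range(x + 1) for v in range(y) if x == y * u + v]
```

For example, `divide(17, 5)` returns `(3, 2)`, and `quotient_remainder_pairs(17, 5)` contains exactly that one pair, in line with the uniqueness clause.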

From this point on, one can generally translate into S and prove the results 
from any text on elementary number theory. There are certain number-theoretic
functions, such as $x^y$ and $x!$, that we have to be able to define in S,
and this we shall do later in this chapter. Some standard results of number 
theory, such as Dirichlet’s theorem, are proved with the aid of the theory of 
complex variables, and it is often not known whether elementary proofs (or 
proofs in S) can be given for such theorems. The statement of some results 
in number theory involves nonelementary concepts, such as the logarithmic 
function, and, except in special cases, cannot even be formulated in S. More 
information about the strength and expressive powers of S will be revealed 
later. For example, it will be shown that there are closed wfs that are neither 
provable nor disprovable in S, if S is consistent; hence there is a wf that is 
true under the standard interpretation but is not provable in S. We also will 
see that this incompleteness of S cannot be attributed to omission of some 
essential axiom but has deeper underlying causes that apply to other theo- 
ries as well. 
Exercises


3.4 Show that the induction principle (S9) is independent of the other axioms
of S.
3.5ᴰ a. Show that there exist nonstandard models for S of any cardinality $\aleph_\alpha$.
b. Ehrenfeucht (1958) has shown the existence of at least $2^{\aleph_0}$ mutually
nonisomorphic models of cardinality $\aleph_\alpha$. Prove the special case that
there are $2^{\aleph_0}$ mutually nonisomorphic denumerable models of S.
3.6ᴰ Give a standard mathematical proof of the categoricity of Peano's postulates,
in the sense that any two "models" are isomorphic. Explain
why this proof does not apply to the first-order theory S.
3.7ᴰ (Presburger, 1929) If we eliminate from S the function letter $f_2^2$ for multiplication
and the axioms (S7) and (S8), show that the new system $S_+$ is
complete and decidable (in the sense of Chapter 1, page 27).
3.8 a. Show that, for every closed term t of S, we can find a natural number
n such that $\vdash_S t = \overline{n}$.
b. Show that every closed atomic wf $t = s$ of S is decidable, that is,
either $\vdash_S t = s$ or $\vdash_S t \neq s$.
c. Show that every closed wf of S without quantifiers is decidable.


3.2 Number-Theoretic Functions and Relations 


A number-theoretic function is a function whose arguments and values are nat- 
ural numbers. Addition and multiplication are familiar examples of number- 
theoretic functions of two arguments. By a number-theoretic relation we mean 
a relation whose arguments are natural numbers. For example, = and < are 
binary number-theoretic relations, and the expression x + y < z determines a 
number-theoretic relation of three arguments.* Number-theoretic functions 
and relations are intuitive and are not bound up with any formal system. 

Let K be any theory in the language $\mathscr{L}_A$ of arithmetic. We say that a number-theoretic
relation R of n arguments is expressible in K if and only if there
is a wf $\mathscr{B}(x_1, \ldots, x_n)$ of K with the free variables $x_1, \ldots, x_n$ such that, for any
natural numbers $k_1, \ldots, k_n$, the following hold:


1. If $R(k_1, \ldots, k_n)$ is true, then $\vdash_K \mathscr{B}(\overline{k_1}, \ldots, \overline{k_n})$.
2. If $R(k_1, \ldots, k_n)$ is false, then $\vdash_K \neg\mathscr{B}(\overline{k_1}, \ldots, \overline{k_n})$.


* We follow the custom of regarding a number-theoretic property, such as the property of 
being even, as a “relation” of one argument. 
For example, the number-theoretic relation of identity is expressed in S by
the wf $x_1 = x_2$. In fact, if $k_1 = k_2$, then $\overline{k_1}$ is the same term as $\overline{k_2}$ and so, by
Proposition 3.2(a), $\vdash_S \overline{k_1} = \overline{k_2}$. Moreover, if $k_1 \neq k_2$, then, by Proposition 3.6(a),
$\vdash_S \overline{k_1} \neq \overline{k_2}$.

Likewise, the relation "less than" is expressed in S by the wf $x_1 < x_2$. Recall
that $x_1 < x_2$ is $(\exists x_3)(x_3 \neq 0 \wedge x_3 + x_1 = x_2)$. If $k_1 < k_2$, then there is some nonzero
number n such that $k_2 = n + k_1$. Now, by Proposition 3.6(a)(ii), $\vdash_S \overline{k_2} = \overline{n} + \overline{k_1}$.
Also, by (S3'), since $n \neq 0$, $\vdash_S \overline{n} \neq 0$. Hence, by rule E4, one can prove in S the wf
$(\exists w)(w \neq 0 \wedge w + \overline{k_1} = \overline{k_2})$; that is, $\vdash_S \overline{k_1} < \overline{k_2}$. On the other hand, if $k_1 \nless k_2$, then
$k_2 < k_1$ or $k_2 = k_1$. If $k_2 < k_1$, then, as we have just seen, $\vdash_S \overline{k_2} < \overline{k_1}$. If $k_2 = k_1$, then
$\vdash_S \overline{k_2} = \overline{k_1}$. In either case, $\vdash_S \overline{k_2} \leq \overline{k_1}$ and then, by Proposition 3.7(a,c), $\vdash_S \overline{k_1} \nless \overline{k_2}$.

Observe that, if a relation is expressible in a theory K, then it is expressible 
in any extension of K. 


Exercises 


3.9 Show that the negation, disjunction, and conjunction of relations 
that are expressible in K are also expressible in K. 


3.10 Show that the relation x + y = z is expressible in S. 


Let K be any theory with equality in the language $\mathscr{L}_A$ of arithmetic. A number-theoretic
function f of n arguments is said to be representable in K if and
only if there is a wf $\mathscr{B}(x_1, \ldots, x_n, y)$ of K with the free variables $x_1, \ldots, x_n, y$ such
that, for any natural numbers $k_1, \ldots, k_n, m$, the following hold:


1. If $f(k_1, \ldots, k_n) = m$, then $\vdash_K \mathscr{B}(\overline{k_1}, \ldots, \overline{k_n}, \overline{m})$.
2. $\vdash_K (\exists_1 y)\mathscr{B}(\overline{k_1}, \ldots, \overline{k_n}, y)$.

If, in this definition, we replace condition 2 by

2'. $\vdash_K (\exists_1 y)\mathscr{B}(x_1, \ldots, x_n, y)$

then the function f is said to be strongly representable in K. Notice
that 2' implies 2, by Gen and rule A4. Hence, strong representability
implies representability. The converse is also true, as we now prove.


Proposition 3.12 (V.H. Dyson) 


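If $f(x_1, \ldots, x_n)$ is representable in K, then it is strongly representable in K.


Proof


Assume f representable in K by a wf $\mathscr{B}(x_1, \ldots, x_n, y)$. Let us show that f is
strongly representable in K by the following wf $\mathscr{C}(x_1, \ldots, x_n, y)$:


$([(\exists_1 y)\mathscr{B}(x_1, \ldots, x_n, y)] \wedge \mathscr{B}(x_1, \ldots, x_n, y)) \vee (\neg[(\exists_1 y)\mathscr{B}(x_1, \ldots, x_n, y)] \wedge y = 0)$

1. Assume $f(k_1, \ldots, k_n) = m$. Then $\vdash_K \mathscr{B}(\overline{k_1}, \ldots, \overline{k_n}, \overline{m})$ and $\vdash_K (\exists_1 y)\mathscr{B}(\overline{k_1}, \ldots, \overline{k_n}, y)$.
So, by conjunction introduction and disjunction
introduction, we get $\vdash_K \mathscr{C}(\overline{k_1}, \ldots, \overline{k_n}, \overline{m})$.

2'. We must show $\vdash_K (\exists_1 y)\mathscr{C}(x_1, \ldots, x_n, y)$.


Case 1. Take $(\exists_1 y)\mathscr{B}(x_1, \ldots, x_n, y)$ as hypothesis. (i) It is easy, using rule C,
to obtain $\mathscr{B}(x_1, \ldots, x_n, b)$ from our hypothesis, where b is a new individual
constant. Together with our hypothesis and conjunction and disjunction
introduction, this yields $\mathscr{C}(x_1, \ldots, x_n, b)$ and then, by rule E4, $(\exists y)\mathscr{C}(x_1, \ldots, x_n, y)$.
(ii) Assume $\mathscr{C}(x_1, \ldots, x_n, u) \wedge \mathscr{C}(x_1, \ldots, x_n, v)$. From $\mathscr{C}(x_1, \ldots, x_n, u)$ and
our hypothesis, we obtain $\mathscr{B}(x_1, \ldots, x_n, u)$, and, from $\mathscr{C}(x_1, \ldots, x_n, v)$ and our
hypothesis, we obtain $\mathscr{B}(x_1, \ldots, x_n, v)$. Now, from $\mathscr{B}(x_1, \ldots, x_n, u)$ and $\mathscr{B}(x_1, \ldots, x_n, v)$
and our hypothesis, we get $u = v$. The deduction theorem yields
$\mathscr{C}(x_1, \ldots, x_n, u) \wedge \mathscr{C}(x_1, \ldots, x_n, v) \Rightarrow u = v$. From (i) and (ii), $(\exists_1 y)\mathscr{C}(x_1, \ldots, x_n, y)$. Thus,
we have proved $\vdash_K (\exists_1 y)\mathscr{B}(x_1, \ldots, x_n, y) \Rightarrow (\exists_1 y)\mathscr{C}(x_1, \ldots, x_n, y)$.


Case 2. Take $\neg(\exists_1 y)\mathscr{B}(x_1, \ldots, x_n, y)$ as hypothesis. (i) Our hypothesis, together
with the theorem $0 = 0$, yields, by conjunction introduction, $\neg(\exists_1 y)\mathscr{B}(x_1, \ldots, x_n, y) \wedge 0 = 0$.
By disjunction introduction, $\mathscr{C}(x_1, \ldots, x_n, 0)$, and, by rule E4,
$(\exists y)\mathscr{C}(x_1, \ldots, x_n, y)$. (ii) Assume $\mathscr{C}(x_1, \ldots, x_n, u) \wedge \mathscr{C}(x_1, \ldots, x_n, v)$. From $\mathscr{C}(x_1, \ldots, x_n, u)$
and our hypothesis, it follows easily that $u = 0$. Likewise, from $\mathscr{C}(x_1, \ldots, x_n, v)$
and our hypothesis, $v = 0$. Hence, $u = v$. By the deduction theorem,
$\mathscr{C}(x_1, \ldots, x_n, u) \wedge \mathscr{C}(x_1, \ldots, x_n, v) \Rightarrow u = v$. From (i) and (ii), $(\exists_1 y)\mathscr{C}(x_1, \ldots, x_n, y)$. Thus we have
proved $\vdash_K \neg(\exists_1 y)\mathscr{B}(x_1, \ldots, x_n, y) \Rightarrow (\exists_1 y)\mathscr{C}(x_1, \ldots, x_n, y)$.

By case 1 and case 2 and an instance of the tautology $[(\mathscr{D} \Rightarrow \mathscr{E}) \wedge (\neg\mathscr{D} \Rightarrow \mathscr{E})] \Rightarrow \mathscr{E}$,
we can obtain $\vdash_K (\exists_1 y)\mathscr{C}(x_1, \ldots, x_n, y)$.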

Since we have proved them to be equivalent, from now on we shall use 
representability and strong representability interchangeably. 

Observe that a function representable in K is representable in any exten- 
sion of K. 


Examples 


In these examples, let K be any theory with equality in the language $\mathscr{L}_A$.


1. The zero function, $Z(x) = 0$, is representable in K by the wf $x_1 = x_1 \wedge
y = 0$. For any k and m, if $Z(k) = m$, then $m = 0$ and $\vdash_K \overline{k} = \overline{k} \wedge 0 = 0$;
that is, condition 1 holds. Also, it is easy to show that $\vdash_K (\exists_1 y)(x_1 = x_1
\wedge y = 0)$. Thus, condition 2' holds.

2. The successor function, $N(x) = x + 1$, is representable in K by the
wf $y = x_1'$. For any k and m, if $N(k) = m$, then $m = k + 1$; hence, $\overline{m}$ is $\overline{k}'$.
Then $\vdash_K \overline{m} = \overline{k}'$. It is easy to verify that $\vdash_K (\exists_1 y)(y = x_1')$.

3. The projection function, $U_j^n(x_1, \ldots, x_n) = x_j$, is representable in K
by $x_1 = x_1 \wedge x_2 = x_2 \wedge \ldots \wedge x_n = x_n \wedge y = x_j$. If $U_j^n(k_1, \ldots, k_n) = m$,
then $m = k_j$. Hence, $\vdash_K \overline{k_1} = \overline{k_1} \wedge \overline{k_2} = \overline{k_2} \wedge \ldots \wedge \overline{k_n} = \overline{k_n} \wedge \overline{m} = \overline{k_j}$. Thus,
condition 1 holds. Also, $\vdash_K (\exists_1 y)(x_1 = x_1 \wedge x_2 = x_2 \wedge \ldots \wedge x_n = x_n \wedge y = x_j)$,
that is, condition 2' holds.

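4. Assume that the functions $g(x_1, \ldots, x_m), h_1(x_1, \ldots, x_n), \ldots, h_m(x_1, \ldots, x_n)$
are strongly representable in the theory with equality K by the
wfs $\mathscr{C}(x_1, \ldots, x_m, z), \mathscr{B}_1(x_1, \ldots, x_n, y_1), \ldots, \mathscr{B}_m(x_1, \ldots, x_n, y_m)$, respectively.
Define a new function f by the equation


$f(x_1, \ldots, x_n) = g(h_1(x_1, \ldots, x_n), \ldots, h_m(x_1, \ldots, x_n))$


f is said to be obtained from $g, h_1, \ldots, h_m$ by substitution. Then f is also strongly
representable in K by the following wf $\mathscr{D}(x_1, \ldots, x_n, z)$:


$(\exists y_1) \ldots (\exists y_m)(\mathscr{B}_1(x_1, \ldots, x_n, y_1) \wedge \ldots \wedge \mathscr{B}_m(x_1, \ldots, x_n, y_m) \wedge \mathscr{C}(y_1, \ldots, y_m, z))$


To prove condition 1, let $f(k_1, \ldots, k_n) = p$. Let $h_j(k_1, \ldots, k_n) = r_j$ for $1 \leq j \leq m$; then
$g(r_1, \ldots, r_m) = p$. Since $\mathscr{C}, \mathscr{B}_1, \ldots, \mathscr{B}_m$ represent $g, h_1, \ldots, h_m$, we have $\vdash_K \mathscr{B}_j(\overline{k_1}, \ldots, \overline{k_n}, \overline{r_j})$
for $1 \leq j \leq m$ and $\vdash_K \mathscr{C}(\overline{r_1}, \ldots, \overline{r_m}, \overline{p})$. So by conjunction introduction,
$\vdash_K \mathscr{B}_1(\overline{k_1}, \ldots, \overline{k_n}, \overline{r_1}) \wedge \ldots \wedge \mathscr{B}_m(\overline{k_1}, \ldots, \overline{k_n}, \overline{r_m}) \wedge \mathscr{C}(\overline{r_1}, \ldots, \overline{r_m}, \overline{p})$. Hence, by rule E4,
$\vdash_K \mathscr{D}(\overline{k_1}, \ldots, \overline{k_n}, \overline{p})$. Thus, condition 1 holds. Now we shall prove condition 2'.
Assume $\mathscr{D}(x_1, \ldots, x_n, u) \wedge \mathscr{D}(x_1, \ldots, x_n, v)$, that is


(Δ) $(\exists y_1) \ldots (\exists y_m)(\mathscr{B}_1(x_1, \ldots, x_n, y_1) \wedge \ldots \wedge \mathscr{B}_m(x_1, \ldots, x_n, y_m) \wedge \mathscr{C}(y_1, \ldots, y_m, u))$

(□) $(\exists y_1) \ldots (\exists y_m)(\mathscr{B}_1(x_1, \ldots, x_n, y_1) \wedge \ldots \wedge \mathscr{B}_m(x_1, \ldots, x_n, y_m) \wedge \mathscr{C}(y_1, \ldots, y_m, v))$

By (Δ), using rule C m times,

$\mathscr{B}_1(x_1, \ldots, x_n, b_1) \wedge \ldots \wedge \mathscr{B}_m(x_1, \ldots, x_n, b_m) \wedge \mathscr{C}(b_1, \ldots, b_m, u)$

By (□), using rule C again,

$\mathscr{B}_1(x_1, \ldots, x_n, c_1) \wedge \ldots \wedge \mathscr{B}_m(x_1, \ldots, x_n, c_m) \wedge \mathscr{C}(c_1, \ldots, c_m, v)$

Since $\vdash_K (\exists_1 y_j)\mathscr{B}_j(x_1, \ldots, x_n, y_j)$, we obtain from $\mathscr{B}_j(x_1, \ldots, x_n, b_j)$ and $\mathscr{B}_j(x_1, \ldots, x_n, c_j)$
that $b_j = c_j$. From $\mathscr{C}(b_1, \ldots, b_m, u)$ and $b_1 = c_1, \ldots, b_m = c_m$, we have $\mathscr{C}(c_1, \ldots, c_m, u)$.
This, with $\vdash_K (\exists_1 z)\mathscr{C}(x_1, \ldots, x_m, z)$ and $\mathscr{C}(c_1, \ldots, c_m, v)$, yields $u = v$. Thus, we have
shown $\vdash_K \mathscr{D}(x_1, \ldots, x_n, u) \wedge \mathscr{D}(x_1, \ldots, x_n, v) \Rightarrow u = v$. It is easy to show that $\vdash_K (\exists z)\mathscr{D}(x_1, \ldots, x_n, z)$.
Hence, $\vdash_K (\exists_1 z)\mathscr{D}(x_1, \ldots, x_n, z)$.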


Exercises 


3.11 Let K be a theory with equality in the language $\mathscr{L}_A$. Show that the following
functions are representable in K.
a. $Z_n(x_1, \ldots, x_n) = 0$ [Hint: $Z_n(x_1, \ldots, x_n) = Z(U_1^n(x_1, \ldots, x_n))$.]
b. $C_k^n(x_1, \ldots, x_n) = k$, where k is a fixed natural number. [Hint: Use
mathematical induction in the metalanguage with respect to k.]
3.12 Prove that addition and multiplication are representable in S.


If R is a relation of n arguments, then the characteristic function $C_R$
is defined as follows:


$$C_R(x_1, \ldots, x_n) = \begin{cases} 0 & \text{if } R(x_1, \ldots, x_n) \text{ is true} \\ 1 & \text{if } R(x_1, \ldots, x_n) \text{ is false} \end{cases}$$

Proposition 3.13 


Let K be a theory with equality in the language $\mathscr{L}_A$ such that $\vdash_K 0 \neq \overline{1}$. Then a
number-theoretic relation R is expressible in K if and only if $C_R$ is representable
in K.


Proof


If R is expressible in K by a wf $\mathscr{B}(x_1, \ldots, x_n)$, it is easy to verify that $C_R$ is representable
in K by the wf $(\mathscr{B}(x_1, \ldots, x_n) \wedge y = 0) \vee (\neg\mathscr{B}(x_1, \ldots, x_n) \wedge y = \overline{1})$.
Conversely, if $C_R$ is representable in K by a wf $\mathscr{C}(x_1, \ldots, x_n, y)$, then, using the
assumption that $\vdash_K 0 \neq \overline{1}$, we can easily show that R is expressible in K by the
wf $\mathscr{C}(x_1, \ldots, x_n, 0)$.


Exercises 


3.13 The graph of a function $f(x_1, \ldots, x_n)$ is the relation $f(x_1, \ldots, x_n) = x_{n+1}$. Show
that $f(x_1, \ldots, x_n)$ is representable in S if and only if its graph is expressible
in S.

3.14 If Q and R are relations of n arguments, prove that $C_{\text{not-}R} = 1 - C_R$,
$C_{(Q \text{ or } R)} = C_Q \cdot C_R$ and $C_{(Q \text{ and } R)} = C_Q + C_R - C_Q \cdot C_R$.

3.15 Show that $f(x_1, \ldots, x_n)$ is representable in a theory with equality K in the
language $\mathscr{L}_A$ if and only if there is a wf $\mathscr{B}(x_1, \ldots, x_n, y)$ such that, for any
$k_1, \ldots, k_n, m$, if $f(k_1, \ldots, k_n) = m$, then $\vdash_K (\forall y)(\mathscr{B}(\overline{k_1}, \ldots, \overline{k_n}, y) \Leftrightarrow y = \overline{m})$.
3.3 Primitive Recursive and Recursive Functions


The study of representability of functions in S leads to a class of number-theoretic
functions that turn out to be of great importance in mathematical
logic and computer science.


Definition 


1. The following functions are called initial functions.
   I. The zero function, $Z(x) = 0$ for all x.
   II. The successor function, $N(x) = x + 1$ for all x.
   III. The projection functions, $U_i^n(x_1, \ldots, x_n) = x_i$ for all $x_1, \ldots, x_n$.

2. The following are rules for obtaining new functions from given
functions.

   IV. Substitution:

       $f(x_1, \ldots, x_n) = g(h_1(x_1, \ldots, x_n), \ldots, h_m(x_1, \ldots, x_n))$

       f is said to be obtained by substitution from the functions
       $g(y_1, \ldots, y_m), h_1(x_1, \ldots, x_n), \ldots, h_m(x_1, \ldots, x_n)$.

   V. Recursion:

       $f(x_1, \ldots, x_n, 0) = g(x_1, \ldots, x_n)$
       $f(x_1, \ldots, x_n, y + 1) = h(x_1, \ldots, x_n, y, f(x_1, \ldots, x_n, y))$

       Here, we allow $n = 0$, in which case we have

       $f(0) = k$, where k is a fixed natural number
       $f(y + 1) = h(y, f(y))$

       We shall say that f is obtained from g and h (or, in the case $n = 0$,
       from h alone) by recursion. The parameters of the recursion are
       $x_1, \ldots, x_n$. Notice that f is well defined: $f(x_1, \ldots, x_n, 0)$ is given by the
       first equation, and if we already know $f(x_1, \ldots, x_n, y)$, then we can
       obtain $f(x_1, \ldots, x_n, y + 1)$ by the second equation.

   VI. Restricted μ-Operator. Assume that $g(x_1, \ldots, x_n, y)$ is a function
       such that for any $x_1, \ldots, x_n$ there is at least one y such that
       $g(x_1, \ldots, x_n, y) = 0$. We denote by $\mu y(g(x_1, \ldots, x_n, y) = 0)$ the least
       number y such that $g(x_1, \ldots, x_n, y) = 0$. In general, for any relation
       $R(x_1, \ldots, x_n, y)$, we denote by $\mu y R(x_1, \ldots, x_n, y)$ the least y
       such that $R(x_1, \ldots, x_n, y)$ is true, if there is any y at all such that
       $R(x_1, \ldots, x_n, y)$ holds. Let $f(x_1, \ldots, x_n) = \mu y(g(x_1, \ldots, x_n, y) = 0)$.
       Then f is said to be obtained from g by means of the restricted
       μ-operator if the given assumption about g holds, namely, for
       any $x_1, \ldots, x_n$, there is at least one y such that $g(x_1, \ldots, x_n, y) = 0$.

3. A function f is said to be primitive recursive if and only if it can be
obtained from the initial functions by any finite number of substitutions
(IV) and recursions (V), that is, if there is a finite sequence
of functions $f_0, \ldots, f_n$ such that $f_n = f$ and, for $0 \leq i \leq n$, either $f_i$ is an
initial function or $f_i$ comes from preceding functions in the sequence
by an application of rule (IV) or rule (V).

4. A function f is said to be recursive if and only if it can be obtained from
the initial functions by any finite number of applications of substitution
(IV), recursion (V) and the restricted μ-operator (VI). This differs from
the definition above of primitive recursive functions only in the addition
of possible applications of the restricted μ-operator. Hence, every
primitive recursive function is recursive. We shall see later that the converse
is false.
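
To make rules (I) through (VI) concrete, here is a minimal Python sketch of the initial functions and the three formation rules as higher-order functions. It is an informal model only: the names `U`, `substitute`, `recurse` and `mu` are our own, recursion is unrolled into a loop, and for the case n = 0 we assume the fixed number k is passed directly in place of g.

```python
def Z(x):
    """Zero function (I)."""
    return 0


def N(x):
    """Successor function (II)."""
    return x + 1


def U(n, i):
    """Projection function U_i^n (III), with i counted from 1."""
    def proj(*xs):
        if len(xs) != n:
            raise TypeError("expected %d arguments" % n)
        return xs[i - 1]
    return proj


def substitute(g, *hs):
    """Rule (IV): f(xs) = g(h_1(xs), ..., h_m(xs))."""
    return lambda *xs: g(*(h(*xs) for h in hs))


def recurse(g, h):
    """Rule (V): f(xs, 0) = g(xs) and f(xs, y + 1) = h(xs, y, f(xs, y)).

    For n = 0, pass the fixed number k itself as g (an assumption of this sketch).
    """
    def f(*args):
        *xs, y = args
        acc = g(*xs) if callable(g) else g
        for i in range(y):
            acc = h(*xs, i, acc)
        return acc
    return f


def mu(g):
    """Rule (VI): least y with g(xs, y) == 0; does not terminate if no such y exists."""
    def f(*xs):
        y = 0
        while g(*xs, y) != 0:
            y += 1
        return y
    return f


# Addition and multiplication obtained from the initial functions.
add = recurse(U(1, 1), substitute(N, U(3, 3)))
mul = recurse(Z, substitute(add, U(3, 3), U(3, 1)))
# Factorial as a recursion with n = 0: f(0) = 1, f(y + 1) = f(y) * (y + 1).
fact = recurse(1, substitute(mul, U(2, 2), substitute(N, U(2, 1))))
```

For instance, `add(3, 4)` evaluates to 7 and `fact(4)` to 24. The `mu` helper searches upward from 0, so it is only safe to call when the assumption of rule (VI) holds for its argument.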


We shall show that the class of recursive functions is identical with the class 
of functions representable in S. (In the literature, the phrase “general recur- 
sive” is sometimes used instead of “recursive.”) 

First, let us prove that we can add “dummy variables” to and also per- 
mute and identify variables in any primitive recursive or recursive function, 
obtaining a function of the same type. 


Proposition 3.14 


Let $g(y_1, \ldots, y_k)$ be primitive recursive (or recursive). Let $x_1, \ldots, x_n$ be distinct
variables and, for $1 \leq i \leq k$, let $z_i$ be one of $x_1, \ldots, x_n$. Then the function f such
that $f(x_1, \ldots, x_n) = g(z_1, \ldots, z_k)$ is primitive recursive (or recursive).


Proof


Let $z_i = x_{j_i}$, where $1 \leq j_i \leq n$. Then $z_i = U_{j_i}^n(x_1, \ldots, x_n)$. Thus,


$f(x_1, \ldots, x_n) = g(U_{j_1}^n(x_1, \ldots, x_n), \ldots, U_{j_k}^n(x_1, \ldots, x_n))$


and therefore f is primitive recursive (or recursive), since it arises from
$g, U_{j_1}^n, \ldots, U_{j_k}^n$ by substitution.
Examples


1. Adding dummy variables. If $g(x_1, x_3)$ is primitive recursive and if $f(x_1, x_2,
x_3) = g(x_1, x_3)$, then $f(x_1, x_2, x_3)$ is also primitive recursive. In Proposition
3.14, let $z_1 = x_1$ and $z_2 = x_3$. The new variable $x_2$ is called a "dummy
variable" since its value has no influence on the value of $f(x_1, x_2, x_3)$.

2. Permuting variables. If $g(x_1, x_2, x_3)$ is primitive recursive and if $f(x_1,
x_2, x_3) = g(x_3, x_1, x_2)$, then $f(x_1, x_2, x_3)$ is also primitive recursive. In
Proposition 3.14, let $z_1 = x_3$, $z_2 = x_1$, and $z_3 = x_2$.

3. Identifying variables. If $g(x_1, x_2, x_3)$ is primitive recursive and if $f(x_1, x_2) =
g(x_1, x_2, x_1)$, then $f(x_1, x_2)$ is primitive recursive. In Proposition 3.14, let
$n = 2$ and $z_1 = x_1$, $z_2 = x_2$ and $z_3 = x_1$.
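
The three examples are all instances of one construction: substitute projection functions into g. A small Python sketch makes this visible; the helper names `project` and `rearrange` and the sample functions `g2` and `g3` are our own and serve only as illustrations.

```python
def project(n, i):
    """Projection U_i^n: pick the i-th of n arguments (i counted from 1)."""
    def proj(*xs):
        if len(xs) != n:
            raise TypeError("expected %d arguments" % n)
        return xs[i - 1]
    return proj


def rearrange(g, n, indices):
    """Return f with f(x_1, ..., x_n) = g(x_{j_1}, ..., x_{j_k}).

    `indices` lists j_1, ..., j_k; f is built by substituting projections into g.
    """
    projections = [project(n, j) for j in indices]
    return lambda *xs: g(*(p(*xs) for p in projections))


def g2(a, b):
    """Sample function of two arguments (hypothetical)."""
    return 10 * a + b


def g3(a, b, c):
    """Sample function of three arguments (hypothetical)."""
    return 100 * a + 10 * b + c


dummy = rearrange(g2, 3, [1, 3])        # f(x1, x2, x3) = g2(x1, x3)
permuted = rearrange(g3, 3, [3, 1, 2])  # f(x1, x2, x3) = g3(x3, x1, x2)
identified = rearrange(g3, 2, [1, 2, 1])  # f(x1, x2) = g3(x1, x2, x1)
```

With these samples, `dummy(1, 2, 3)` ignores its second argument, `permuted(1, 2, 3)` evaluates `g3(3, 1, 2)`, and `identified(4, 5)` evaluates `g3(4, 5, 4)`.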


Corollary 3.15 


a. The zero function $Z_n(x_1, \ldots, x_n) = 0$ is primitive recursive.

b. The constant function $C_k^n(x_1, \ldots, x_n) = k$, where k is some fixed natural
number, is primitive recursive.

c. The substitution rule (IV) can be extended to the case where each
$h_i$ may be a function of some but not necessarily all of the variables.
Likewise, in the recursion rule (V), the function g may not involve
all of the variables $x_1, \ldots, x_n$, and h may not involve all
of the variables $x_1, \ldots, x_n, y$, or $f(x_1, \ldots, x_n, y)$.


Proof


a. In Proposition 3.14, let g be the zero function Z; then $k = 1$. Take
$z_1$ to be $x_1$.

b. Use mathematical induction. For $k = 0$, this is part (a). Assume $C_k^n$
primitive recursive. Then $C_{k+1}^n(x_1, \ldots, x_n)$ is primitive recursive by
the substitution $C_{k+1}^n(x_1, \ldots, x_n) = N(C_k^n(x_1, \ldots, x_n))$.

c. By Proposition 3.14, any variables among $x_1, \ldots, x_n$ not
present in a function can be added as dummy variables.
For example, if $h(x_1, x_3)$ is primitive recursive, then
$h^*(x_1, x_2, x_3) = h(x_1, x_3) = h(U_1^3(x_1, x_2, x_3), U_3^3(x_1, x_2, x_3))$ is also
primitive recursive, since it is obtained by a substitution.


Proposition 3.16 


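The following functions are primitive recursive.

a. $x + y$

b. $x \cdot y$

c. $x^y$

d. $\delta(x) = \begin{cases} x - 1 & \text{if } x > 0 \\ 0 & \text{if } x = 0 \end{cases}$
   $\delta$ is called the predecessor function.

e. $x \dot{-} y = \begin{cases} x - y & \text{if } x \ge y \\ 0 & \text{if } x < y \end{cases}$

f. $|x - y| = \begin{cases} x - y & \text{if } x \ge y \\ y - x & \text{if } x < y \end{cases}$

g. $\mathrm{sg}(x) = \begin{cases} 0 & \text{if } x = 0 \\ 1 & \text{if } x \ne 0 \end{cases}$

h. $\overline{\mathrm{sg}}(x) = \begin{cases} 1 & \text{if } x = 0 \\ 0 & \text{if } x \ne 0 \end{cases}$

i. $x!$

j. $\min(x, y)$ = minimum of $x$ and $y$

k. $\min(x_1, \ldots, x_n)$

l. $\max(x, y)$ = maximum of $x$ and $y$

m. $\max(x_1, \ldots, x_n)$

n. $\mathrm{rm}(x, y)$ = remainder upon division of $y$ by $x$

o. $\mathrm{qt}(x, y)$ = quotient upon division of $y$ by $x$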


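Proof

a. Recursion rule (V):
   $x + 0 = x$, or $f(x, 0) = U_1^1(x)$
   $x + (y + 1) = N(x + y)$, or $f(x, y + 1) = N(f(x, y))$

b. $x \cdot 0 = 0$, or $g(x, 0) = Z(x)$
   $x \cdot (y + 1) = (x \cdot y) + x$, or $g(x, y + 1) = f(g(x, y), x)$,
   where $f$ is the addition function

c. $x^0 = 1$
   $x^{y+1} = (x^y) \cdot x$

d. $\delta(0) = 0$
   $\delta(y + 1) = y$

e. $x \dot{-} 0 = x$
   $x \dot{-} (y + 1) = \delta(x \dot{-} y)$

f. $|x - y| = (x \dot{-} y) + (y \dot{-} x)$ (substitution)

g. $\mathrm{sg}(x) = x \dot{-} \delta(x)$ (substitution)

h. $\overline{\mathrm{sg}}(x) = 1 \dot{-} \mathrm{sg}(x)$ (substitution)

i. $0! = 1$
   $(y + 1)! = (y!) \cdot (y + 1)$

j. $\min(x, y) = x \dot{-} (x \dot{-} y)$

k. Assume $\min(x_1, \ldots, x_n)$ already shown primitive recursive.
   $\min(x_1, \ldots, x_n, x_{n+1}) = \min(\min(x_1, \ldots, x_n), x_{n+1})$

l. $\max(x, y) = y + (x \dot{-} y)$

m. $\max(x_1, \ldots, x_n, x_{n+1}) = \max(\max(x_1, \ldots, x_n), x_{n+1})$

n. $\mathrm{rm}(x, 0) = 0$
   $\mathrm{rm}(x, y + 1) = N(\mathrm{rm}(x, y)) \cdot \mathrm{sg}(|x - N(\mathrm{rm}(x, y))|)$

o. $\mathrm{qt}(x, 0) = 0$
   $\mathrm{qt}(x, y + 1) = \mathrm{qt}(x, y) + \overline{\mathrm{sg}}(|x - N(\mathrm{rm}(x, y))|)$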


In justification of (n) and (o), note that, if $q$ and $r$ denote the quotient $\mathrm{qt}(x, y)$
and remainder $\mathrm{rm}(x, y)$ upon division of $y$ by $x$, then $y = qx + r$ and $0 \le r < x$.
So, $y + 1 = qx + (r + 1)$. If $r + 1 < x$ (that is, if $|x - N(\mathrm{rm}(x, y))| > 0$), then the
quotient $\mathrm{qt}(x, y + 1)$ and remainder $\mathrm{rm}(x, y + 1)$ upon division of $y + 1$ by $x$ are $q$
and $r + 1$, respectively. If $r + 1 = x$ (that is, if $|x - N(\mathrm{rm}(x, y))| = 0$), then
$y + 1 = (q + 1)x$, and $\mathrm{qt}(x, y + 1)$ and $\mathrm{rm}(x, y + 1)$ are $q + 1$ and 0, respectively.*
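The recursions in the proof of Proposition 3.16 can be transcribed almost literally into a program. The following Python sketch is only an illustration: the helper `prim_rec` and all function names are our own choices, not notation from the text, and an ordinary loop stands in for the recursion rule (V).

```python
def prim_rec(base, step):
    """Return f with f(x, 0) = base(x) and f(x, y + 1) = step(x, y, f(x, y))."""
    def f(x, y):
        acc = base(x)
        for i in range(y):
            acc = step(x, i, acc)
        return acc
    return f

def N(x):
    """Successor function."""
    return x + 1

add = prim_rec(lambda x: x, lambda x, y, prev: N(prev))          # (a)
mul = prim_rec(lambda x: 0, lambda x, y, prev: add(prev, x))     # (b)
power = prim_rec(lambda x: 1, lambda x, y, prev: mul(prev, x))   # (c)

def pred(y):
    """(d) delta(0) = 0, delta(y + 1) = y."""
    return prim_rec(lambda _: 0, lambda _, i, prev: i)(0, y)

monus = prim_rec(lambda x: x, lambda x, y, prev: pred(prev))     # (e)

def absdiff(x, y):
    """(f) |x - y| by substitution."""
    return add(monus(x, y), monus(y, x))

def sg(x):
    """(g) 0 if x = 0, otherwise 1."""
    return monus(x, pred(x))

def sgbar(x):
    """(h) 1 if x = 0, otherwise 0."""
    return monus(1, sg(x))

# (n) rm(x, y): remainder upon division of y by x
rm = prim_rec(lambda x: 0,
              lambda x, y, prev: mul(N(prev), sg(absdiff(x, N(prev)))))

def qt(x, y):
    """(o) quotient upon division of y by x; each step consults rm(x, i)."""
    q = 0
    for i in range(y):
        q = add(q, sgbar(absdiff(x, N(rm(x, i)))))
    return q
```

With these definitions, `rm(3, 10)` evaluates to 1 and `qt(3, 10)` to 3, while `rm(0, 5)` is 5 and `qt(0, 5)` is 0, in agreement with the footnote on division by 0. The sketch is deliberately inefficient, since it mirrors the recursion equations rather than using built-in arithmetic.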


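Definitions

$$\sum_{y<z} f(x_1, \ldots, x_n, y) = \begin{cases} 0 & \text{if } z = 0 \\ f(x_1, \ldots, x_n, 0) + \cdots + f(x_1, \ldots, x_n, z - 1) & \text{if } z > 0 \end{cases}$$

$$\sum_{y \le z} f(x_1, \ldots, x_n, y) = \sum_{y<z+1} f(x_1, \ldots, x_n, y)$$

$$\prod_{y<z} f(x_1, \ldots, x_n, y) = \begin{cases} 1 & \text{if } z = 0 \\ f(x_1, \ldots, x_n, 0) \cdots f(x_1, \ldots, x_n, z - 1) & \text{if } z > 0 \end{cases}$$

$$\prod_{y \le z} f(x_1, \ldots, x_n, y) = \prod_{y<z+1} f(x_1, \ldots, x_n, y)$$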


* Since one cannot divide by 0, the values of $\mathrm{rm}(0, y)$ and $\mathrm{qt}(0, y)$ have no intuitive significance.
It can be easily shown by induction that the given definitions yield $\mathrm{rm}(0, y) = y$ and $\mathrm{qt}(0, y) = 0$.

These bounded sums and products are functions of $x_1, \ldots, x_n, z$. We can also
define doubly bounded sums and products in terms of the ones already
given; for example,

$$\sum_{u<y<v} f(x_1, \ldots, x_n, y) = f(x_1, \ldots, x_n, u + 1) + \cdots + f(x_1, \ldots, x_n, v - 1) = \sum_{y<\delta(v \dot{-} u)} f(x_1, \ldots, x_n, y + u + 1)$$


Proposition 3.17 


If $f(x_1, \ldots, x_n, y)$ is primitive recursive (or recursive), then all the bounded
sums and products defined above are also primitive recursive (or recursive).


Proof 


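Let $g(x_1, \ldots, x_n, z) = \sum_{y<z} f(x_1, \ldots, x_n, y)$. Then we have the following recursion:

$$g(x_1, \ldots, x_n, 0) = 0$$
$$g(x_1, \ldots, x_n, z + 1) = g(x_1, \ldots, x_n, z) + f(x_1, \ldots, x_n, z)$$

If $h(x_1, \ldots, x_n, z) = \sum_{y \le z} f(x_1, \ldots, x_n, y)$, then

$$h(x_1, \ldots, x_n, z) = g(x_1, \ldots, x_n, z + 1) \quad \text{(substitution)}$$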


The proofs for bounded products and doubly bounded sums and products 
are left as exercises. 


Examples 

Let $\tau(x)$ be the number of divisors of $x$, if $x > 0$, and let $\tau(0) = 1$. (Thus, $\tau(x)$ is
the number of divisors of $x$ that are less than or equal to $x$.) Then $\tau$ is primitive
recursive, since

$$\tau(x) = \sum_{y \le x} \overline{\mathrm{sg}}(\mathrm{rm}(y, x))$$
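As a quick check of the divisor-counting formula, here is a small Python sketch of bounded sums and products together with the function counting divisors. The names `bounded_sum`, `bounded_prod`, and `tau` are our own, and built-in arithmetic replaces the primitive recursive definitions of rm and the negated sign function; only the book's convention rm(0, y) = y is kept.

```python
def bounded_sum(f, z):
    """Sum of f(y) for y < z, via g(0) = 0 and g(z + 1) = g(z) + f(z)."""
    g = 0
    for y in range(z):
        g = g + f(y)
    return g

def bounded_prod(f, z):
    """Product of f(y) for y < z, via g(0) = 1 and g(z + 1) = g(z) * f(z)."""
    g = 1
    for y in range(z):
        g = g * f(y)
    return g

def rm(x, y):
    """Remainder upon division of y by x, with rm(0, y) = y as in the text."""
    return y if x == 0 else y % x

def sgbar(x):
    """1 if x = 0, otherwise 0."""
    return 1 if x == 0 else 0

def tau(x):
    """Number of divisors of x; the sum over y <= x is a sum over y < x + 1."""
    return bounded_sum(lambda y: sgbar(rm(y, x)), x + 1)
```

For instance, `tau(6)` is 4 (divisors 1, 2, 3, 6) and `tau(0)` is 1, because only the term for y = 0 contributes.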


Given expressions for number-theoretic relations, we can apply the connectives
of the propositional calculus to them to obtain new expressions for
relations. For example, if $R_1(x, y)$ and $R_2(x, u, v)$ are relations, then
$R_1(x, y) \wedge R_2(x, u, v)$ is a new relation that holds for $x, y, u, v$ when and only when both
$R_1(x, y)$ and $R_2(x, u, v)$ hold. We shall use $(\forall y)_{y<z} R(x_1, \ldots, x_n, y)$ to express the
relation: for all $y$, if $y$ is less than $z$, then $R(x_1, \ldots, x_n, y)$ holds. We shall use
$(\forall y)_{y \le z}$, $(\exists y)_{y<z}$, and $(\exists y)_{y \le z}$ in an analogous way; for example,
$(\exists y)_{y<z} R(x_1, \ldots, x_n, y)$ means that there is some $y < z$ such that $R(x_1, \ldots, x_n, y)$
holds. We shall call $(\forall y)_{y<z}$, $(\forall y)_{y \le z}$, $(\exists y)_{y<z}$, and $(\exists y)_{y \le z}$
bounded quantifiers. In addition, we define a bounded $\mu$-operator:

$$\mu y_{y<z} R(x_1, \ldots, x_n, y) = \begin{cases} \text{the least } y < z \text{ for which } R(x_1, \ldots, x_n, y) \text{ holds} & \text{if there is such a } y \\ z & \text{otherwise} \end{cases}$$

The value $z$ is chosen in the second case because it is more convenient in later
proofs; this choice has no intuitive significance. We also define
$\mu y_{y \le z} R(x_1, \ldots, x_n, y)$ to be $\mu y_{y<z+1} R(x_1, \ldots, x_n, y)$.

A relation $R(x_1, \ldots, x_n)$ is said to be primitive recursive (or recursive) if and
only if its characteristic function $C_R(x_1, \ldots, x_n)$ is primitive recursive (or
recursive). In particular, a set $A$ of natural numbers is primitive recursive (or
recursive) if and only if its characteristic function $C_A(x)$ is primitive recursive
(or recursive).


Examples 


1. The relation $x_1 = x_2$ is primitive recursive. Its characteristic function
is $\mathrm{sg}(|x_1 - x_2|)$, which is primitive recursive, by Proposition 3.16(f, g).

2. The relation $x_1 < x_2$ is primitive recursive, since its characteristic
function is $\overline{\mathrm{sg}}(x_2 \dot{-} x_1)$, which is primitive recursive, by Proposition
3.16(e, h).

3. The relation $x_1 | x_2$ is primitive recursive, since its characteristic
function is $\mathrm{sg}(\mathrm{rm}(x_1, x_2))$.

4. The relation $\mathrm{Pr}(x)$, $x$ is a prime, is primitive recursive, since
$C_{\mathrm{Pr}}(x) = \mathrm{sg}(|\tau(x) - 2|)$. Note that an integer is a prime if and only if it has
exactly two divisors; recall that $\tau(0) = 1$.


Proposition 3.18 


Relations obtained from primitive recursive (or recursive) relations by means
of the propositional connectives and the bounded quantifiers are also primitive
recursive (or recursive). Also, applications of the bounded $\mu$-operators
$\mu y_{y<z}$ and $\mu y_{y \le z}$ lead from primitive recursive (or recursive) relations to
primitive recursive (or recursive) functions.




Proof 


Assume $R_1(x_1, \ldots, x_n)$ and $R_2(x_1, \ldots, x_n)$ are primitive recursive (or recursive)
relations. Then the characteristic functions $C_{R_1}$ and $C_{R_2}$ are primitive recursive (or
recursive). But $C_{\neg R_1}(x_1, \ldots, x_n) = 1 \dot{-} C_{R_1}(x_1, \ldots, x_n)$; hence $\neg R_1$ is primitive
recursive (or recursive). Also, $C_{R_1 \vee R_2}(x_1, \ldots, x_n) = C_{R_1}(x_1, \ldots, x_n) \cdot C_{R_2}(x_1, \ldots, x_n)$;
so, $R_1 \vee R_2$ is primitive recursive (or recursive). Since all propositional connectives
are definable in terms of $\neg$ and $\vee$, this takes care of them. Now, assume
$R(x_1, \ldots, x_n, y)$ is primitive recursive (or recursive). If $Q(x_1, \ldots, x_n, z)$ is the
relation $(\exists y)_{y<z} R(x_1, \ldots, x_n, y)$, then it is easy to verify that
$C_Q(x_1, \ldots, x_n, z) = \prod_{y<z} C_R(x_1, \ldots, x_n, y)$, which, by Proposition 3.17, is primitive
recursive (or recursive). The bounded quantifier $(\exists y)_{y \le z}$ is equivalent to
$(\exists y)_{y<z+1}$, which is obtainable from $(\exists y)_{y<z}$ by substitution. Also, $(\forall y)_{y<z}$ is
equivalent to $\neg(\exists y)_{y<z}\neg$, and $(\forall y)_{y \le z}$ is equivalent to $\neg(\exists y)_{y \le z}\neg$.
Doubly bounded quantifiers, such as $(\exists y)_{u<y<v}$, can be
defined by substitution, using the bounded quantifiers already mentioned.
Finally, $\prod_{u \le y} C_R(x_1, \ldots, x_n, u)$ has the value 1 for all $y$ such that $R(x_1, \ldots, x_n, u)$ is
false for all $u \le y$; it has the value 0 as soon as there is some $u \le y$ such that
$R(x_1, \ldots, x_n, u)$ holds. Hence, $\sum_{y<z} \left( \prod_{u \le y} C_R(x_1, \ldots, x_n, u) \right)$ counts the number of
integers from 0 up to but not including the first $y < z$ such that $R(x_1, \ldots, x_n, y)$
holds and is $z$ if there is no such $y$; thus, it is equal to $\mu y_{y<z} R(x_1, \ldots, x_n, y)$ and
so the latter function is primitive recursive (or recursive) by Proposition 3.17.
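The two product formulas in this proof are easy to try out numerically. The Python sketch below is an illustration under our own naming (`char`, `exists_below`, `mu_below`); it follows the text's convention that a characteristic function takes the value 0 when the relation holds and 1 otherwise, and it treats R as a relation of the single bounded variable y.

```python
def char(R):
    """Characteristic function in the text's convention: 0 if R holds, 1 otherwise."""
    return lambda *args: 0 if R(*args) else 1

def exists_below(R, z):
    """(exists y < z) R(y), decided from the product of C_R(y) over y < z."""
    product = 1
    for y in range(z):
        product *= char(R)(y)
    return product == 0

def mu_below(R, z):
    """Bounded mu-operator: sum over y < z of the product of C_R(u) over u <= y."""
    total = 0
    for y in range(z):
        product = 1
        for u in range(y + 1):
            product *= char(R)(u)
        total += product          # contributes 1 exactly while R has not yet held
    return total
```

For example, `mu_below(lambda y: y * y >= 10, 20)` returns 4, the least y below 20 whose square is at least 10, and `mu_below(lambda y: y > 100, 7)` returns the bound 7 because no witness exists below it.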




Examples 


1. Let $p(x)$ be the $x$th prime number in ascending order. Thus, $p(0) = 2$,
$p(1) = 3$, $p(2) = 5$, and so on. We shall write $p_x$ instead of $p(x)$. Then $p_x$
is a primitive recursive function. In fact,

$$p_0 = 2$$
$$p_{x+1} = \mu y_{y \le (p_x)! + 1}(p_x < y \wedge \mathrm{Pr}(y))$$

Notice that the relation $u < y \wedge \mathrm{Pr}(y)$ is primitive recursive. Hence,
by Proposition 3.18, the function $\mu y_{y \le v}(u < y \wedge \mathrm{Pr}(y))$ is a primitive
recursive function $g(u, v)$. If we substitute the primitive recursive
functions $z$ and $z! + 1$ for $u$ and $v$, respectively, in $g(u, v)$, we obtain
the primitive recursive function

$$h(z) = \mu y_{y \le z! + 1}(z < y \wedge \mathrm{Pr}(y))$$

and the right-hand side of the second equation above is $h(p_x)$; hence,
we have an application of the recursion rule (V). The bound $(p_x)! + 1$
on the first prime after $p_x$ is obtained from Euclid’s proof of the
infinitude of primes (see Exercise 3.23).
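The recursion for the sequence of primes can be sketched in Python as follows. This is an illustration, not the book's construction verbatim: `is_prime`, `mu_at_most`, and `nth_prime` are hypothetical names, and the bounded search stops at the first witness instead of evaluating sums of products. The factorial bound is used exactly as in the recursion above.

```python
from math import factorial

def is_prime(y):
    """Pr(y): y has exactly two divisors among 1, ..., y."""
    return y >= 2 and sum(1 for d in range(1, y + 1) if y % d == 0) == 2

def mu_at_most(R, bound):
    """Least y <= bound with R(y); bound + 1 if there is no such y."""
    for y in range(bound + 1):
        if R(y):
            return y
    return bound + 1

def nth_prime(x):
    """p_x, with p_0 = 2 and p_(x+1) = least y <= p_x! + 1 with p_x < y and Pr(y)."""
    p = 2
    for _ in range(x):
        p = mu_at_most(lambda y, p=p: p < y and is_prime(y), factorial(p) + 1)
    return p
```

Because the search always succeeds well before the bound, the huge value of the factorial costs nothing here; the bound matters only for showing that the search is a bounded one.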

2. Every positive integer $x$ has a unique factorization into prime powers:
$x = p_0^{a_0} p_1^{a_1} \cdots p_k^{a_k}$. Let us denote by $(x)_j$ the exponent $a_j$ in this
factorization. If $x = 1$, $(x)_j = 0$ for all $j$. If $x = 0$, we arbitrarily let
$(x)_j = 0$ for all $j$. Then the function $(x)_j$ is primitive recursive, since

$$(x)_j = \mu y_{y<x}\left( p_j^y \mid x \wedge \neg(p_j^{y+1} \mid x) \right).$$

3. For $x > 0$, let $\ell h(x)$ be the number of nonzero exponents in the
factorization of $x$ into powers of primes, or, equivalently, the number
of distinct primes that divide $x$. Let $\ell h(0) = 0$. Then $\ell h$ is primitive
recursive. To see this, let $R(x, y)$ be the primitive recursive relation
$\mathrm{Pr}(y) \wedge y \mid x \wedge x \ne 0$. Then $\ell h(x) = \sum_{y \le x} \overline{\mathrm{sg}}(C_R(x, y))$. Note that
this yields the special cases $\ell h(0) = \ell h(1) = 0$. The expression “$\ell h(x)$”
should be read “length of $x$.”

4. If the number $x = 2^{a_0} 3^{a_1} \cdots p_k^{a_k}$ is used to “represent” or “encode”
the sequence of positive integers $a_0, a_1, \ldots, a_k$, and $y = 2^{b_0} 3^{b_1} \cdots p_m^{b_m}$
“represents” the sequence of positive integers $b_0, b_1, \ldots, b_m$, then the
number

$$x * y = 2^{a_0} 3^{a_1} \cdots p_k^{a_k} \, p_{k+1}^{b_0} \, p_{k+2}^{b_1} \cdots p_{k+1+m}^{b_m}$$

“represents” the new sequence $a_0, a_1, \ldots, a_k, b_0, b_1, \ldots, b_m$ obtained by
juxtaposing the two sequences. Note that $\ell h(x) = k + 1$, which is the
length of the first sequence, $\ell h(y) = m + 1$, which is the length of the
second sequence, and $b_j = (y)_j$. Hence,

$$x * y = x \times \prod_{j<\ell h(y)} \left( p_{\ell h(x)+j} \right)^{(y)_j}$$

and, thus, $*$ is a primitive recursive function, called the juxtaposition
function. It is not difficult to show that $x * (y * z) = (x * y) * z$ as long
as $y \ne 0$ (which will be the only case of interest to us). Therefore,
there is no harm in omitting parentheses when writing two or more
applications of $*$. Also observe that $x * 0 = x * 1 = x$.
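The coding of sequences by prime exponents in Examples 2 to 4 can be tried out with a short Python sketch. The function names (`nth_prime`, `exponent`, `lh`, `juxtapose`, `encode`) are our own, and the exponent is found by repeated division rather than by the bounded search used in the text; the values agree for every argument, including the conventions for 0 and 1.

```python
def nth_prime(j):
    """p_j, with p_0 = 2 (plain trial division; intended for small j only)."""
    count, n = -1, 1
    while count < j:
        n += 1
        if all(n % d for d in range(2, n)):
            count += 1
    return n

def exponent(x, j):
    """(x)_j: exponent of p_j in x, with (0)_j = 0 by convention."""
    if x == 0:
        return 0
    p, y = nth_prime(j), 0
    while x % p == 0:
        x //= p
        y += 1
    return y

def lh(x):
    """Number of distinct primes dividing x, with lh(0) = lh(1) = 0."""
    return sum(1 for y in range(2, x + 1)
               if x % y == 0 and all(y % d for d in range(2, y)))

def juxtapose(x, y):
    """x * y = x times the product over j < lh(y) of p_(lh(x)+j) ** (y)_j."""
    result = x
    for j in range(lh(y)):
        result *= nth_prime(lh(x) + j) ** exponent(y, j)
    return result

def encode(seq):
    """Code a sequence of positive integers a_0, ..., a_k as 2**a_0 * 3**a_1 * ..."""
    x = 1
    for j, a in enumerate(seq):
        x *= nth_prime(j) ** a
    return x
```

For example, `encode([1, 2])` is 18 and `encode([3])` is 8, and `juxtapose(18, 8)` gives 2250, which equals `encode([1, 2, 3])`.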


Exercises


3.16 Assume that $R(x_1, \ldots, x_n, y)$ is a primitive recursive (or recursive) relation.
Prove the following:

a. $(\exists y)_{u<y<v} R(x_1, \ldots, x_n, y)$, $(\exists y)_{u \le y \le v} R(x_1, \ldots, x_n, y)$, and
$(\exists y)_{u \le y<v} R(x_1, \ldots, x_n, y)$ are primitive recursive (or recursive) relations.

b. $\mu y_{u<y<v} R(x_1, \ldots, x_n, y)$, $\mu y_{u \le y \le v} R(x_1, \ldots, x_n, y)$, and
$\mu y_{u \le y<v} R(x_1, \ldots, x_n, y)$ are primitive recursive (or recursive) functions.

c. If, for all natural numbers $x_1, \ldots, x_n$, there exists a natural number $y$
such that $R(x_1, \ldots, x_n, y)$, then the function $f(x_1, \ldots, x_n) = \mu y R(x_1, \ldots, x_n, y)$
is recursive. [Hint: Apply the restricted $\mu$-operator to $C_R(x_1, \ldots, x_n, y)$.]

3.17 a. Show that the intersection, union and complement of primitive
recursive (or recursive) sets are also primitive recursive (or recursive).
Recall that a set $A$ of numbers can be thought of as a relation
with one argument, namely, the relation that is true of a number $x$
when and only when $x \in A$.

b. Show that every finite set is primitive recursive.

3.18 Prove that a function $f(x_1, \ldots, x_n)$ is recursive if and only if its
representing relation $f(x_1, \ldots, x_n) = y$ is a recursive relation.

3.19 Let $[\sqrt{n}]$ denote the greatest integer less than or equal to $\sqrt{n}$, and let
$\Pi(n)$ denote the number of primes less than or equal to $n$. Show that
$[\sqrt{n}]$ and $\Pi(n)$ are primitive recursive.

3.20 Let $e$ be the base of the natural logarithms. Show that $[ne]$, the greatest
integer less than or equal to $ne$, is a primitive recursive function.

3.21 Let $\mathrm{RP}(y, z)$ hold if and only if $y$ and $z$ are relatively prime, that is, $y$ and
$z$ have no common factor greater than 1. Let $\varphi(n)$ be the number of positive
integers less than or equal to $n$ that are relatively prime to $n$. Prove
that $\mathrm{RP}$ and $\varphi$ are primitive recursive.

3.22 Show that, in the definition of the primitive recursive functions, one
need not assume that $Z(x) = 0$ is one of the initial functions.

3.23 Prove that $p_{k+1} \le (p_0 p_1 \cdots p_k) + 1$. Conclude that $p_{k+1} \le p_k! + 1$.


For use in the further study of recursive functions, we prove the following 
theorem on definition by cases. 


Proposition 3.19 


Let

$$f(x_1, \ldots, x_n) = \begin{cases} g_1(x_1, \ldots, x_n) & \text{if } R_1(x_1, \ldots, x_n) \text{ holds} \\ g_2(x_1, \ldots, x_n) & \text{if } R_2(x_1, \ldots, x_n) \text{ holds} \\ \quad \vdots & \\ g_k(x_1, \ldots, x_n) & \text{if } R_k(x_1, \ldots, x_n) \text{ holds} \end{cases}$$

If the functions $g_1, \ldots, g_k$ and the relations $R_1, \ldots, R_k$ are primitive recursive
(or recursive), and if, for any $x_1, \ldots, x_n$, exactly one of the relations
$R_1(x_1, \ldots, x_n), \ldots, R_k(x_1, \ldots, x_n)$ is true, then $f$ is primitive recursive (or recursive).




Proof 


$$f(x_1, \ldots, x_n) = g_1(x_1, \ldots, x_n) \cdot \overline{\mathrm{sg}}(C_{R_1}(x_1, \ldots, x_n)) + \cdots + g_k(x_1, \ldots, x_n) \cdot \overline{\mathrm{sg}}(C_{R_k}(x_1, \ldots, x_n)).$$


Exercises


3.24 Show that in Proposition 3.19 it is not necessary to assume that $R_k$ is
primitive recursive (or recursive).


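3.25 Let

$$f(x) = \begin{cases} x^2 & \text{if } x \text{ is even} \\ x + 1 & \text{if } x \text{ is odd} \end{cases}$$

Prove that $f$ is primitive recursive.

3.26 Let

$$h(x) = \begin{cases} 2 & \text{if Goldbach’s conjecture is true} \\ 1 & \text{if Goldbach’s conjecture is false} \end{cases}$$

Is $h$ primitive recursive?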

It is often important to have available a primitive recursive one-one cor- 
respondence between the set of ordered pairs of natural numbers and the set 
of natural numbers. We shall enumerate the pairs as follows: 


(0, 0), (0, 1), (1, 0), (1, 1), (0, 2), (2, 0), (1, 2), (2, 1), (2, 2), ...


After we have enumerated all the pairs having components less than or
equal to $k$, we then add a new group of all the new pairs having components
less than or equal to $k + 1$ in the following order: $(0, k + 1)$, $(k + 1, 0)$, $(1, k + 1)$,
$(k + 1, 1)$, ..., $(k, k + 1)$, $(k + 1, k)$, $(k + 1, k + 1)$. If $x < y$, then $(x, y)$ occurs before $(y, x)$
and both are in the $(y + 1)$th group. (Note that we start from 1 in counting
groups.) The first $y$ groups contain $y^2$ pairs, and $(x, y)$ is the $(2x + 1)$th pair
in the $(y + 1)$th group. Hence, $(x, y)$ is the $(y^2 + 2x + 1)$th pair in the ordering,
and $(y, x)$ is the $(y^2 + 2x + 2)$th pair. On the other hand, if $x = y$, $(x, y)$ is
the $((x + 1)^2)$th pair. This justifies the following definition, in which $\sigma^2(x, y)$
denotes the place of the pair $(x, y)$ in the above enumeration, with $(0, 0)$
considered to be in the 0th place:

$$\sigma^2(x, y) = \mathrm{sg}(x \dot{-} y) \cdot (x^2 + 2y + 1) + \overline{\mathrm{sg}}(x \dot{-} y) \cdot (y^2 + 2x)$$

Clearly, $\sigma^2$ is primitive recursive.

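Let us define inverse functions $\sigma_1^2$ and $\sigma_2^2$ such that $\sigma_1^2(\sigma^2(x, y)) = x$,
$\sigma_2^2(\sigma^2(x, y)) = y$ and $\sigma^2(\sigma_1^2(z), \sigma_2^2(z)) = z$. Thus, $\sigma_1^2(z)$ and $\sigma_2^2(z)$ are the first and
second components of the $z$th ordered pair in the given enumeration. Note
first that $\sigma_1^2(0) = 0$, $\sigma_2^2(0) = 0$,

$$\sigma_1^2(n + 1) = \begin{cases} \sigma_2^2(n) & \text{if } \sigma_1^2(n) < \sigma_2^2(n) \\ \sigma_2^2(n) + 1 & \text{if } \sigma_1^2(n) > \sigma_2^2(n) \\ 0 & \text{if } \sigma_1^2(n) = \sigma_2^2(n) \end{cases}$$

and

$$\sigma_2^2(n + 1) = \begin{cases} \sigma_1^2(n) & \text{if } \sigma_1^2(n) \ne \sigma_2^2(n) \\ \sigma_1^2(n) + 1 & \text{if } \sigma_1^2(n) = \sigma_2^2(n) \end{cases}$$

Hence,

$$\sigma_1^2(n + 1) = \sigma_2^2(n) \cdot \mathrm{sg}(\sigma_2^2(n) \dot{-} \sigma_1^2(n)) + (\sigma_2^2(n) + 1) \cdot \mathrm{sg}(\sigma_1^2(n) \dot{-} \sigma_2^2(n)) = \varphi(\sigma_1^2(n), \sigma_2^2(n))$$

$$\sigma_2^2(n + 1) = \sigma_1^2(n) \cdot \mathrm{sg}(|\sigma_2^2(n) - \sigma_1^2(n)|) + (\sigma_1^2(n) + 1) \cdot \overline{\mathrm{sg}}(|\sigma_1^2(n) - \sigma_2^2(n)|) = \psi(\sigma_1^2(n), \sigma_2^2(n))$$

where $\varphi$ and $\psi$ are primitive recursive functions. Thus, $\sigma_1^2$ and $\sigma_2^2$ are defined
recursively at the same time. We can show that $\sigma_1^2$ and $\sigma_2^2$ are primitive
recursive in the following devious way. Let $h(u) = 2^{\sigma_1^2(u)} 3^{\sigma_2^2(u)}$. Now, $h$ is primitive
recursive, since $h(0) = 2^{\sigma_1^2(0)} 3^{\sigma_2^2(0)} = 2^0 \cdot 3^0 = 1$, and

$$h(n + 1) = 2^{\sigma_1^2(n+1)} 3^{\sigma_2^2(n+1)} = 2^{\varphi(\sigma_1^2(n), \sigma_2^2(n))} 3^{\psi(\sigma_1^2(n), \sigma_2^2(n))} = 2^{\varphi((h(n))_0, (h(n))_1)} 3^{\psi((h(n))_0, (h(n))_1)}.$$

Remembering that the function $(x)_i$ is primitive recursive (see Example 2 on page 181),
we conclude by recursion rule (V) that $h$ is primitive recursive. But $\sigma_1^2(x) = (h(x))_0$ and
$\sigma_2^2(x) = (h(x))_1$. By substitution, $\sigma_1^2$ and $\sigma_2^2$ are primitive recursive.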
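The pairing function and its two inverses can be checked mechanically. The Python sketch below is an illustration with names of our own choosing (`sigma2`, `unpair`); the inverse is computed by the simultaneous recursion on both components described above, not by the detour through the auxiliary function h.

```python
def sg(x):
    """0 if x = 0, otherwise 1."""
    return 0 if x == 0 else 1

def sgbar(x):
    """1 if x = 0, otherwise 0."""
    return 1 - sg(x)

def monus(x, y):
    """Truncated subtraction."""
    return x - y if x >= y else 0

def sigma2(x, y):
    """Place of (x, y) in the enumeration, with (0, 0) in place 0."""
    return (sg(monus(x, y)) * (x * x + 2 * y + 1)
            + sgbar(monus(x, y)) * (y * y + 2 * x))

def unpair(z):
    """Both components of the z-th pair, by the simultaneous recursion."""
    a, b = 0, 0
    for _ in range(z):
        if a < b:
            a, b = b, a          # (x, y) with x < y is followed by (y, x)
        elif a > b:
            a, b = b + 1, a      # (y, x) with x < y is followed by (x + 1, y)
        else:
            a, b = 0, a + 1      # (k, k) closes a group; the next one starts at (0, k + 1)
    return a, b
```

Applying `sigma2` to the nine pairs listed at the start of the enumeration yields the places 0 through 8 in order, and `unpair` undoes `sigma2` on every pair tried in the checks below.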

One-one primitive recursive correspondences between all $n$-tuples of natural
numbers and all natural numbers can be defined step-by-step, using
induction on $n$. For $n = 2$, it has already been done. Assume that, for $n = k$, we
have primitive recursive functions $\sigma^k(x_1, \ldots, x_k)$, $\sigma_1^k(x), \ldots, \sigma_k^k(x)$ such that
$\sigma_i^k(\sigma^k(x_1, \ldots, x_k)) = x_i$ for $1 \le i \le k$, and $\sigma^k(\sigma_1^k(x), \ldots, \sigma_k^k(x)) = x$. Now, for
$n = k + 1$, define $\sigma^{k+1}(x_1, \ldots, x_k, x_{k+1}) = \sigma^2(\sigma^k(x_1, \ldots, x_k), x_{k+1})$,
$\sigma_i^{k+1}(x) = \sigma_i^k(\sigma_1^2(x))$ for $1 \le i \le k$, and $\sigma_{k+1}^{k+1}(x) = \sigma_2^2(x)$. Then
$\sigma^{k+1}, \sigma_1^{k+1}, \ldots, \sigma_{k+1}^{k+1}$ are all primitive recursive, and we leave it as an exercise
to verify that $\sigma_i^{k+1}(\sigma^{k+1}(x_1, \ldots, x_{k+1})) = x_i$ for $1 \le i \le k + 1$, and
$\sigma^{k+1}(\sigma_1^{k+1}(x), \ldots, \sigma_{k+1}^{k+1}(x)) = x$.

It will be essential in later work to define functions by a recursion in which
the value of $f(x_1, \ldots, x_n, y + 1)$ depends not only upon $f(x_1, \ldots, x_n, y)$ but also upon
several or all values of $f(x_1, \ldots, x_n, u)$ with $u \le y$. This type of recursion is called
a course-of-values recursion. Let $f\#(x_1, \ldots, x_n, y) = \prod_{u<y} p_u^{f(x_1, \ldots, x_n, u)}$. Note that
$f$ can be obtained from $f\#$ as follows: $f(x_1, \ldots, x_n, y) = (f\#(x_1, \ldots, x_n, y + 1))_y$.


Proposition 3.20 (Course-of-Values Recursion) 


If $h(x_1, \ldots, x_n, y, z)$ is primitive recursive (or recursive) and
$f(x_1, \ldots, x_n, y) = h(x_1, \ldots, x_n, y, f\#(x_1, \ldots, x_n, y))$, then $f$ is primitive recursive (or recursive).

Proof


$$f\#(x_1, \ldots, x_n, 0) = 1$$
$$f\#(x_1, \ldots, x_n, y + 1) = f\#(x_1, \ldots, x_n, y) \cdot p_y^{f(x_1, \ldots, x_n, y)} = f\#(x_1, \ldots, x_n, y) \cdot p_y^{h(x_1, \ldots, x_n, y, f\#(x_1, \ldots, x_n, y))}$$

Thus, by the recursion rule, $f\#$ is primitive recursive (or recursive), and
$f(x_1, \ldots, x_n, y) = (f\#(x_1, \ldots, x_n, y + 1))_y$.


Example 


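The Fibonacci sequence is defined as follows: $f(0) = 1$, $f(1) = 1$, and
$f(k + 2) = f(k) + f(k + 1)$ for $k \ge 0$. Then $f$ is primitive recursive, since

$$f(n) = \overline{\mathrm{sg}}(n) + \overline{\mathrm{sg}}(|n - 1|) + \left( (f\#(n))_{n \dot{-} 1} + (f\#(n))_{n \dot{-} 2} \right) \cdot \mathrm{sg}(n \dot{-} 1)$$

The function

$$h(y, z) = \overline{\mathrm{sg}}(y) + \overline{\mathrm{sg}}(|y - 1|) + \left( (z)_{y \dot{-} 1} + (z)_{y \dot{-} 2} \right) \cdot \mathrm{sg}(y \dot{-} 1)$$

is primitive recursive, and $f(n) = h(n, f\#(n))$.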
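To see the course-of-values encoding at work on the Fibonacci example, the following Python sketch builds the history code one prime power at a time and reads earlier values back out of it as exponents. All names (`nth_prime`, `exponent`, `h`, `course_of_values`) are our own, and built-in arithmetic replaces the primitive recursive definitions; the point is only to illustrate how h sees the earlier values through its second argument.

```python
def nth_prime(j):
    """p_j, with p_0 = 2 (plain trial division; intended for small j only)."""
    count, n = -1, 1
    while count < j:
        n += 1
        if all(n % d for d in range(2, n)):
            count += 1
    return n

def exponent(x, j):
    """(x)_j: exponent of p_j in x, with (0)_j = 0 by convention."""
    if x == 0:
        return 0
    p, y = nth_prime(j), 0
    while x % p == 0:
        x //= p
        y += 1
    return y

def h(y, z):
    """h(y, z) from the Fibonacci example; z plays the role of the history code."""
    sg = lambda t: 0 if t == 0 else 1
    sgbar = lambda t: 1 if t == 0 else 0
    monus = lambda a, b: max(a - b, 0)
    return (sgbar(y) + sgbar(abs(y - 1))
            + (exponent(z, monus(y, 1)) + exponent(z, monus(y, 2))) * sg(monus(y, 1)))

def course_of_values(h, y):
    """Return (f(y), history code at y), where f(y) = h(y, history code at y)."""
    code = 1                                    # the code at 0 is the empty product
    for u in range(y):
        code *= nth_prime(u) ** h(u, code)      # multiply in p_u ** f(u)
    return h(y, code), code
```

For instance, `course_of_values(h, 3)` returns `(3, 150)`: the history code is 2 to the power 1 times 3 to the power 1 times 5 to the power 2, and the value 3 is the sum of the exponents of 5 and 3 in that code.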


Exercise 


3.27 Let $g(0) = 2$, $g(1) = 4$, and $g(k + 2) = 3g(k + 1) - (2g(k) + 1)$. Show that $g$ is
primitive recursive.


Corollary 3.21 (Course-of-Values Recursion for Relations) 


If $H(x_1, \ldots, x_n, y, z)$ is a primitive recursive (or recursive) relation and
$R(x_1, \ldots, x_n, y)$ holds if and only if $H(x_1, \ldots, x_n, y, (C_R)\#(x_1, \ldots, x_n, y))$, where $C_R$
is the characteristic function of $R$, then $R$ is primitive recursive (or recursive).


Proof 


$C_R(x_1, \ldots, x_n, y) = C_H(x_1, \ldots, x_n, y, (C_R)\#(x_1, \ldots, x_n, y))$. Since $C_H$ is primitive
recursive (or recursive), Proposition 3.20 implies that $C_R$ is primitive
recursive (or recursive) and, therefore, so is $R$.

Proposition 3.20 and Corollary 3.21 will be drawn upon heavily in what
follows. They are applicable whenever the value of a function or relation for $y$
is defined in terms of values for arguments less than $y$ (by means of a primitive
recursive or recursive function or relation). Notice in this connection
that $R(x_1, \ldots, x_n, u)$ is equivalent to $C_R(x_1, \ldots, x_n, u) = 0$, which, in turn, for $u < y$,
is equivalent to $((C_R)\#(x_1, \ldots, x_n, y))_u = 0$.


Exercises 


3.28 Prove that the set of recursive functions is denumerable. 

3.29 If $f_0, f_1, f_2, \ldots$ is an enumeration of all primitive recursive functions (or
all recursive functions) of one variable, prove that the function $f_x(y)$ is
not primitive recursive (or recursive).


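Lemma 3.22 (Gödel’s β-Function)


Let $\beta(x_1, x_2, x_3) = \mathrm{rm}(1 + (x_3 + 1) \cdot x_2, x_1)$. Then $\beta$ is primitive recursive, by
Proposition 3.16(n). Also, $\beta$ is strongly representable in S by the following wf
$Bt(x_1, x_2, x_3, y)$:

$$(\exists w)(x_1 = (1 + (x_3 + 1) \cdot x_2) \cdot w + y \wedge y < 1 + (x_3 + 1) \cdot x_2)$$


Proof


By Proposition 3.11, $\vdash_S (\exists_1 y) Bt(x_1, x_2, x_3, y)$. Assume $\beta(k_1, k_2, k_3) = m$.
Then $k_1 = (1 + (k_3 + 1) \cdot k_2) \cdot k + m$ for some $k$, and $m < 1 + (k_3 + 1) \cdot k_2$.
So, $\vdash_S \overline{k_1} = (\overline{1} + (\overline{k_3} + \overline{1}) \cdot \overline{k_2}) \cdot \overline{k} + \overline{m}$, by Proposition 3.6(a). Moreover,
$\vdash_S \overline{m} < \overline{1} + (\overline{k_3} + \overline{1}) \cdot \overline{k_2}$ by the expressibility of $<$ and Proposition 3.6(a). Hence,
$\vdash_S \overline{k_1} = (\overline{1} + (\overline{k_3} + \overline{1}) \cdot \overline{k_2}) \cdot \overline{k} + \overline{m} \wedge \overline{m} < \overline{1} + (\overline{k_3} + \overline{1}) \cdot \overline{k_2}$, from which by rule E4,
$\vdash_S Bt(\overline{k_1}, \overline{k_2}, \overline{k_3}, \overline{m})$. Thus, $Bt$ strongly represents $\beta$ in S.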


Lemma 3.23 


For any sequence of natural numbers $k_0, k_1, \ldots, k_n$, there exist natural numbers
$b$ and $c$ such that $\beta(b, c, i) = k_i$ for $0 \le i \le n$.


Proof 


Let $j = \max(n, k_0, k_1, \ldots, k_n)$ and let $c = j!$. Consider the numbers $u_i = 1 + (i + 1)c$
for $0 \le i \le n$; no two of them have a factor in common other than 1. In fact,
if $p$ were a prime dividing both $1 + (i + 1)c$ and $1 + (m + 1)c$ with $0 \le i < m \le n$,
then $p$ would divide their difference $(m - i)c$. Now, $p$ does not divide $c$, since,
in that case $p$ would divide both $(i + 1)c$ and $1 + (i + 1)c$, and so would divide 1,
which is impossible. Hence, $p$ also does not divide $(m - i)$; for $m - i \le n \le j$
and so, $m - i$ divides $j! = c$. If $p$ divided $m - i$, then $p$ would divide $c$.
Therefore, $p$ does not divide $(m - i)c$, which yields a contradiction. Thus,
the numbers $u_i$, $0 \le i \le n$, are relatively prime in pairs. Also, for $0 \le i \le n$,
$k_i \le j \le j! = c < 1 + (i + 1)c = u_i$; that is, $k_i < u_i$. Now, by the Chinese remainder theorem
(see Exercise 3.30), there is a number $b < u_0 u_1 \cdots u_n$ such that $\mathrm{rm}(u_i, b) = k_i$ for
$0 \le i \le n$. But $\beta(b, c, i) = \mathrm{rm}(1 + (i + 1)c, b) = \mathrm{rm}(u_i, b) = k_i$.

Lemmas 3.22 and 3.23 enable us to express within S assertions about finite 
sequences of natural numbers, and this ability is crucial in part of the proof 
of the following fundamental theorem. 
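The construction in the proof of Lemma 3.23 is effective, and a short Python sketch makes this concrete. The names `beta` and `encode_sequence` are ours; the sketch assumes Python 3.8 or later for `math.prod` and for modular inverses via `pow(w, -1, u)`, and it solves the congruences with the standard Chinese remainder construction, which is one possible way to obtain the number b whose existence the lemma asserts.

```python
from math import factorial, prod

def beta(x1, x2, x3):
    """Goedel's beta function: remainder of x1 upon division by 1 + (x3 + 1) * x2."""
    return x1 % (1 + (x3 + 1) * x2)

def encode_sequence(ks):
    """Find (b, c) with beta(b, c, i) == ks[i] for every index i of the nonempty list ks."""
    n = len(ks) - 1
    c = factorial(max([n] + list(ks)))                  # c = j! with j = max(n, k_0, ..., k_n)
    moduli = [1 + (i + 1) * c for i in range(n + 1)]    # pairwise relatively prime
    total = prod(moduli)
    b = 0
    for k, u in zip(ks, moduli):
        w = total // u                  # product of the other moduli
        b += k * w * pow(w, -1, u)      # pow(w, -1, u) is the inverse of w modulo u
    return b % total, c
```

A typical use is `b, c = encode_sequence([3, 1, 4, 1, 5])`, after which `beta(b, c, i)` recovers the i-th entry for each i from 0 to 4. The numbers b and c grow very quickly, which is harmless for the lemma but makes this coding impractical as a data structure.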


Proposition 3.24 


Every recursive function is representable in S. 


Proof 


The initial functions $Z$, $N$, and $U_i^n$ are representable in S, by Examples 1-3 on
page 171. The substitution rule (IV) does not lead out of the class of
representable functions, by Example 4 on page 172.

For the recursion rule (V), assume that $g(x_1, \ldots, x_n)$ and $h(x_1, \ldots, x_n, y, z)$ are
representable in S by wfs $\mathscr{B}(x_1, \ldots, x_{n+1})$ and $\mathscr{C}(x_1, \ldots, x_{n+3})$, respectively, and let

(I)
$$f(x_1, \ldots, x_n, 0) = g(x_1, \ldots, x_n)$$
$$f(x_1, \ldots, x_n, y + 1) = h(x_1, \ldots, x_n, y, f(x_1, \ldots, x_n, y))$$

Now, $f(x_1, \ldots, x_n, y) = z$ if and only if there is a finite sequence of numbers
$b_0, \ldots, b_y$ such that $b_0 = g(x_1, \ldots, x_n)$, $b_{w+1} = h(x_1, \ldots, x_n, w, b_w)$ for $w + 1 \le y$, and
$b_y = z$. But, by Lemma 3.23, reference to finite sequences can be paraphrased
in terms of the function $\beta$ and, by Lemma 3.22, $\beta$ is representable in S by the
wf $Bt(x_1, x_2, x_3, y)$.

We shall show that $f(x_1, \ldots, x_n, x_{n+1})$ is representable in S by the following
wf $\mathscr{D}(x_1, \ldots, x_{n+2})$:

$$(\exists u)(\exists v)[((\exists w)(Bt(u, v, 0, w) \wedge \mathscr{B}(x_1, \ldots, x_n, w))) \wedge Bt(u, v, x_{n+1}, x_{n+2})$$
$$\wedge (\forall w)(w < x_{n+1} \Rightarrow (\exists y)(\exists z)(Bt(u, v, w, y) \wedge Bt(u, v, w', z) \wedge \mathscr{C}(x_1, \ldots, x_n, w, y, z)))]$$


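i. First, assume that $f(k_1, \ldots, k_n, p) = m$. We wish to show that
$\vdash_S \mathscr{D}(\overline{k_1}, \ldots, \overline{k_n}, \overline{p}, \overline{m})$. If $p = 0$, then $m = g(k_1, \ldots, k_n)$. Consider the
sequence consisting of $m$ alone. By Lemma 3.23, there exist $b$ and $c$
such that $\beta(b, c, 0) = m$. Hence, by Lemma 3.22,

(*) $\vdash_S Bt(\overline{b}, \overline{c}, 0, \overline{m})$

Also, since $m = g(k_1, \ldots, k_n)$, we have $\vdash_S \mathscr{B}(\overline{k_1}, \ldots, \overline{k_n}, \overline{m})$. Hence, by rule E4,

(**) $\vdash_S (\exists w)(Bt(\overline{b}, \overline{c}, 0, w) \wedge \mathscr{B}(\overline{k_1}, \ldots, \overline{k_n}, w))$

In addition, since $\vdash_S w \nless 0$, a tautology and Gen yield

(***) $(\forall w)(w < 0 \Rightarrow (\exists y)(\exists z)(Bt(\overline{b}, \overline{c}, w, y) \wedge Bt(\overline{b}, \overline{c}, w', z) \wedge \mathscr{C}(\overline{k_1}, \ldots, \overline{k_n}, w, y, z)))$

Applying rule E4 to the conjunction of (*), (**), and (***), we obtain
$\vdash_S \mathscr{D}(\overline{k_1}, \ldots, \overline{k_n}, 0, \overline{m})$. Now, for $p > 0$, $f(k_1, \ldots, k_n, p)$ is calculated from
the equations (I) in $p + 1$ steps. Let $r_i = f(k_1, \ldots, k_n, i)$. For the sequence
of numbers $r_0, \ldots, r_p$, there are, by Lemma 3.23, numbers $b$ and $c$ such
that $\beta(b, c, i) = r_i$ for $0 \le i \le p$. Hence, by Lemma 3.22, $\vdash_S Bt(\overline{b}, \overline{c}, \overline{i}, \overline{r_i})$.
In particular, $\beta(b, c, 0) = r_0 = f(k_1, \ldots, k_n, 0) = g(k_1, \ldots, k_n)$. Therefore,
$\vdash_S Bt(\overline{b}, \overline{c}, 0, \overline{r_0}) \wedge \mathscr{B}(\overline{k_1}, \ldots, \overline{k_n}, \overline{r_0})$, and, by rule E4,
(i) $\vdash_S (\exists w)(Bt(\overline{b}, \overline{c}, 0, w) \wedge \mathscr{B}(\overline{k_1}, \ldots, \overline{k_n}, w))$. Since $r_p = f(k_1, \ldots, k_n, p) = m$,
we have $\beta(b, c, p) = m$. Hence, (ii) $\vdash_S Bt(\overline{b}, \overline{c}, \overline{p}, \overline{m})$. For
$0 \le i \le p - 1$, $\beta(b, c, i) = r_i = f(k_1, \ldots, k_n, i)$ and $\beta(b, c, i + 1) = r_{i+1} =
f(k_1, \ldots, k_n, i + 1) = h(k_1, \ldots, k_n, i, f(k_1, \ldots, k_n, i)) = h(k_1, \ldots, k_n, i, r_i)$.
Therefore, $\vdash_S Bt(\overline{b}, \overline{c}, \overline{i}, \overline{r_i}) \wedge Bt(\overline{b}, \overline{c}, \overline{i}', \overline{r_{i+1}}) \wedge \mathscr{C}(\overline{k_1}, \ldots, \overline{k_n}, \overline{i}, \overline{r_i}, \overline{r_{i+1}})$. By
rule E4, $\vdash_S (\exists y)(\exists z)(Bt(\overline{b}, \overline{c}, \overline{i}, y) \wedge Bt(\overline{b}, \overline{c}, \overline{i}', z) \wedge \mathscr{C}(\overline{k_1}, \ldots, \overline{k_n}, \overline{i}, y, z))$.
So, by Proposition 3.8(b′), (iii) $\vdash_S (\forall w)(w < \overline{p} \Rightarrow (\exists y)(\exists z)
(Bt(\overline{b}, \overline{c}, w, y) \wedge Bt(\overline{b}, \overline{c}, w', z) \wedge \mathscr{C}(\overline{k_1}, \ldots, \overline{k_n}, w, y, z)))$. Then, applying
rule E4 twice to the conjunction of (i), (ii), and (iii), we obtain
$\vdash_S \mathscr{D}(\overline{k_1}, \ldots, \overline{k_n}, \overline{p}, \overline{m})$. Thus, we have verified clause 1 of the definition
of representability (see page 170).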

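ii. We must show that $\vdash_S (\exists_1 x_{n+2}) \mathscr{D}(\overline{k_1}, \ldots, \overline{k_n}, \overline{p}, x_{n+2})$. The proof is by
induction on $p$ in the metalanguage. Notice that, by what we have
proved above, it suffices to prove only uniqueness. The case of $p = 0$ is
left as an easy exercise. Assume $\vdash_S (\exists_1 x_{n+2}) \mathscr{D}(\overline{k_1}, \ldots, \overline{k_n}, \overline{p}, x_{n+2})$. Let
$\alpha = g(k_1, \ldots, k_n)$, $\beta = f(k_1, \ldots, k_n, p)$, and $\gamma = f(k_1, \ldots, k_n, p + 1) = h(k_1, \ldots, k_n, p, \beta)$. Then

(1) $\vdash_S \mathscr{C}(\overline{k_1}, \ldots, \overline{k_n}, \overline{p}, \overline{\beta}, \overline{\gamma})$
(2) $\vdash_S \mathscr{B}(\overline{k_1}, \ldots, \overline{k_n}, \overline{\alpha})$
(3) $\vdash_S \mathscr{D}(\overline{k_1}, \ldots, \overline{k_n}, \overline{p}, \overline{\beta})$
(4) $\vdash_S \mathscr{D}(\overline{k_1}, \ldots, \overline{k_n}, \overline{p + 1}, \overline{\gamma})$
(5) $\vdash_S (\exists_1 x_{n+2}) \mathscr{D}(\overline{k_1}, \ldots, \overline{k_n}, \overline{p}, x_{n+2})$

Assume

(6) $\mathscr{D}(\overline{k_1}, \ldots, \overline{k_n}, \overline{p + 1}, x_{n+2})$

We must prove $x_{n+2} = \overline{\gamma}$. From (6), by rule C,

a. $(\exists w)(Bt(b, c, 0, w) \wedge \mathscr{B}(\overline{k_1}, \ldots, \overline{k_n}, w))$
b. $Bt(b, c, \overline{p + 1}, x_{n+2})$
c. $(\forall w)(w < \overline{p + 1} \Rightarrow (\exists y)(\exists z)(Bt(b, c, w, y) \wedge Bt(b, c, w', z) \wedge \mathscr{C}(\overline{k_1}, \ldots, \overline{k_n}, w, y, z)))$

From (c),

d. $(\forall w)(w < \overline{p} \Rightarrow (\exists y)(\exists z)(Bt(b, c, w, y) \wedge Bt(b, c, w', z) \wedge \mathscr{C}(\overline{k_1}, \ldots, \overline{k_n}, w, y, z)))$

From (c) by rule A4 and rule C,

e. $Bt(b, c, \overline{p}, d) \wedge Bt(b, c, \overline{p + 1}, e) \wedge \mathscr{C}(\overline{k_1}, \ldots, \overline{k_n}, \overline{p}, d, e)$

From (a), (d), and (e),

f. $\mathscr{D}(\overline{k_1}, \ldots, \overline{k_n}, \overline{p}, d)$

From (f), (5), and (3),

g. $d = \overline{\beta}$

From (e) and (g),

h. $\mathscr{C}(\overline{k_1}, \ldots, \overline{k_n}, \overline{p}, \overline{\beta}, e)$

Since $\mathscr{C}$ represents $h$, we obtain from (1) and (h),

i. $\overline{\gamma} = e$

From (e) and (i),

j. $Bt(b, c, \overline{p + 1}, \overline{\gamma})$

From (b), (j), and Lemma 3.22,

k. $x_{n+2} = \overline{\gamma}$

This completes the induction.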

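The $\mu$-operator (VI). Let us assume that, for any $x_1, \ldots, x_n$, there is some $y$
such that $g(x_1, \ldots, x_n, y) = 0$, and let us assume $g$ is representable in S by a wf
$\mathscr{E}(x_1, \ldots, x_{n+2})$. Let $f(x_1, \ldots, x_n) = \mu y(g(x_1, \ldots, x_n, y) = 0)$. Then we shall show that $f$
is representable in S by the wf $\mathscr{F}(x_1, \ldots, x_{n+1})$:

$$\mathscr{E}(x_1, \ldots, x_{n+1}, 0) \wedge (\forall y)(y < x_{n+1} \Rightarrow \neg\mathscr{E}(x_1, \ldots, x_n, y, 0))$$

Assume $f(k_1, \ldots, k_n) = m$. Then $g(k_1, \ldots, k_n, m) = 0$ and, for $k < m$, $g(k_1, \ldots, k_n, k) \ne 0$.
So, $\vdash_S \mathscr{E}(\overline{k_1}, \ldots, \overline{k_n}, \overline{m}, 0)$ and, for $k < m$, $\vdash_S \neg\mathscr{E}(\overline{k_1}, \ldots, \overline{k_n}, \overline{k}, 0)$. By Proposition
3.8(b′), $\vdash_S (\forall y)(y < \overline{m} \Rightarrow \neg\mathscr{E}(\overline{k_1}, \ldots, \overline{k_n}, y, 0))$. Hence, $\vdash_S \mathscr{F}(\overline{k_1}, \ldots, \overline{k_n}, \overline{m})$.
We must also show: $\vdash_S (\exists_1 x_{n+1}) \mathscr{F}(\overline{k_1}, \ldots, \overline{k_n}, x_{n+1})$. It suffices to prove the
uniqueness. Assume $\mathscr{E}(\overline{k_1}, \ldots, \overline{k_n}, u, 0) \wedge (\forall y)(y < u \Rightarrow \neg\mathscr{E}(\overline{k_1}, \ldots, \overline{k_n}, y, 0))$. By
Proposition 3.7(o′), $\vdash_S \overline{m} < u \vee \overline{m} = u \vee u < \overline{m}$. Since $\vdash_S \mathscr{E}(\overline{k_1}, \ldots, \overline{k_n}, \overline{m}, 0)$, we
cannot have $\overline{m} < u$. Since $\vdash_S (\forall y)(y < \overline{m} \Rightarrow \neg\mathscr{E}(\overline{k_1}, \ldots, \overline{k_n}, y, 0))$, we cannot
have $u < \overline{m}$. Hence, $u = \overline{m}$. This shows the uniqueness.
Thus, we have proved that all recursive functions are representable in S.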


Corollary 3.25 


Every recursive relation is expressible in S. 


Proof 


Let $R(x_1, \ldots, x_n)$ be a recursive relation. Then its characteristic function $C_R$
is recursive. By Proposition 3.24, $C_R$ is representable in S and, therefore, by
Proposition 3.13, $R$ is expressible in S.


Exercises 


3.30^A a. Show that, if $a$ and $b$ are relatively prime natural numbers, then
there is a natural number $c$ such that $ac \equiv 1 \pmod{b}$. (Two numbers
$a$ and $b$ are said to be relatively prime if their greatest common divisor
is 1. In general, $x \equiv y \pmod{z}$ means that $x$ and $y$ leave the same
remainder upon division by $z$ or, equivalently, that $x - y$ is divisible
by $z$. This exercise amounts to showing that there exist integers $u$
and $v$ such that $1 = au + bv$.)

b. Prove the Chinese remainder theorem: if $x_1, \ldots, x_k$ are relatively
prime in pairs and $y_1, \ldots, y_k$ are any natural numbers, there is a natural
number $z$ such that $z \equiv y_1 \pmod{x_1}, \ldots, z \equiv y_k \pmod{x_k}$. Moreover,
any two such $z$’s differ by a multiple of $x_1 \cdots x_k$. [Hint: Let $x = x_1 \cdots x_k$
and let $x = w_1 x_1 = w_2 x_2 = \cdots = w_k x_k$. Then, for $1 \le j \le k$, $w_j$ is relatively
prime to $x_j$ and so, by (a), there is some $z_j$ such that $w_j z_j \equiv 1 \pmod{x_j}$.
Now let $z = w_1 z_1 y_1 + w_2 z_2 y_2 + \cdots + w_k z_k y_k$. Then $z \equiv w_j z_j y_j \equiv y_j \pmod{x_j}$.
In addition, the difference between any two such solutions is divisible
by each of $x_1, \ldots, x_k$ and hence by $x_1 \cdots x_k$.]


3.31 Call a relation $R(x_1, \ldots, x_n)$ arithmetical if it is the interpretation of some
wf $\mathscr{B}(x_1, \ldots, x_n)$ in the language $\mathscr{L}_A$ of arithmetic with respect to the
standard model. Show that every recursive relation is arithmetical.
[Hint: Use Corollary 3.25.]


3.4 Arithmetization: Gödel Numbers


For an arbitrary first-order theory K, we correlate with each symbol $u$ of K an
odd positive integer $g(u)$, called the Gödel number of $u$, in the following manner:

$g(\text{(}) = 3$, $g(\text{)}) = 5$, $g(\text{,}) = 7$, $g(\neg) = 9$, $g(\Rightarrow) = 11$, $g(\forall) = 13$,
$g(x_k) = 13 + 8k$ for $k \ge 1$
$g(a_k) = 7 + 8k$ for $k \ge 1$
$g(f_k^n) = 1 + 8(2^n 3^k)$ for $k, n \ge 1$
$g(A_k^n) = 3 + 8(2^n 3^k)$ for $k, n \ge 1$

Clearly, every Gödel number of a symbol is an odd positive integer. Moreover,
when divided by 8, $g(u)$ leaves a remainder of 5 when $u$ is a variable, a remainder
of 7 when $u$ is an individual constant, a remainder of 1 when $u$ is a function
letter, and a remainder of 3 when $u$ is a predicate letter. Thus, different
symbols have different Gödel numbers.


Examples 
$g(x_2) = 29$, $g(a_4) = 39$, $g(f_1^2) = 97$, $g(A_2^1) = 147$


Given an expression $u_0 u_1 \ldots u_r$, where each $u_j$ is a symbol of K, we define its
Gödel number $g(u_0 u_1 \ldots u_r)$ by the equation

$$g(u_0 u_1 \ldots u_r) = 2^{g(u_0)} 3^{g(u_1)} \cdots p_r^{g(u_r)}$$

where $p_j$ denotes the $j$th prime number and we assume that $p_0 = 2$. For
example,

$$g(A_1^2(x_1, x_2)) = 2^{g(A_1^2)} 3^{g(\text{(})} 5^{g(x_1)} 7^{g(\text{,})} 11^{g(x_2)} 13^{g(\text{)})} = 2^{99} 3^{3} 5^{21} 7^{7} 11^{29} 13^{5}$$

Observe that different expressions have different Gödel numbers, by virtue
of the uniqueness of the factorization of integers into primes. In addition,
expressions have different Gödel numbers from symbols, since the former
have even Gödel numbers and the latter odd Gödel numbers. Notice also
that a single symbol, considered as an expression, has a different Gödel
number from its Gödel number as a symbol. For example, the symbol $x_1$ has
Gödel number 21, whereas the expression that consists of only the symbol $x_1$
has Gödel number $2^{21}$.

If $e_0, e_1, \ldots, e_r$ is any finite sequence of expressions of K, we can assign a
Gödel number to this sequence by setting

$$g(e_0, e_1, \ldots, e_r) = 2^{g(e_0)} 3^{g(e_1)} \cdots p_r^{g(e_r)}$$

Different sequences of expressions have different Gödel numbers. Since a
Gödel number of a sequence of expressions is even and the exponent of 2 in
its prime power factorization is also even, it differs from Gödel numbers of
symbols and expressions. Remember that a proof in K is a certain kind of
finite sequence of expressions and, therefore, has a Gödel number.

Thus, $g$ is a one-one function from the set of symbols of K, expressions of K, and
finite sequences of expressions of K, into the set of positive integers. The range of
$g$ is not the whole set of positive integers. For example, 10 is not a Gödel number.
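The three levels of the numbering (symbols, expressions, sequences of expressions) can be computed directly. In the Python sketch below, the representation of symbols is purely our own assumption: punctuation and connectives are strings, and `("x", k)`, `("a", k)`, `("f", n, k)`, `("A", n, k)` stand for the variable with index k, the constant with index k, and the function and predicate letters with superscript n and subscript k. The function names are likewise hypothetical.

```python
def nth_prime(j):
    """p_j, with p_0 = 2 (plain trial division; intended for small j only)."""
    count, n = -1, 1
    while count < j:
        n += 1
        if all(n % d for d in range(2, n)):
            count += 1
    return n

# Assumed spelling of the six fixed symbols; only the numbers come from the text.
PUNCTUATION = {"(": 3, ")": 5, ",": 7, "not": 9, "implies": 11, "forall": 13}

def g_symbol(sym):
    """Goedel number of a single symbol in the tuple representation described above."""
    if sym in PUNCTUATION:
        return PUNCTUATION[sym]
    kind = sym[0]
    if kind == "x":
        return 13 + 8 * sym[1]
    if kind == "a":
        return 7 + 8 * sym[1]
    n, k = sym[1], sym[2]
    if kind == "f":
        return 1 + 8 * (2 ** n * 3 ** k)
    if kind == "A":
        return 3 + 8 * (2 ** n * 3 ** k)
    raise ValueError(f"unknown symbol: {sym!r}")

def g_expression(symbols):
    """Goedel number of an expression, given as a list of symbols."""
    number = 1
    for j, sym in enumerate(symbols):
        number *= nth_prime(j) ** g_symbol(sym)
    return number

def g_sequence(expressions):
    """Goedel number of a finite sequence of expressions."""
    number = 1
    for j, expr in enumerate(expressions):
        number *= nth_prime(j) ** g_expression(expr)
    return number
```

For example, `g_symbol(("f", 2, 1))` is 97, and `g_expression(["(", ")"])` is 1944, the product of 2 cubed and 3 to the fifth power.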


Exercises 


3.32 Determine the objects that have the following Gödel numbers.
a. 1944   b. 49   c. 15   d. 13 824   e. $2^{51} 3^{11} 5^{9}$

3.33 Show that, if $n$ is odd, $4n$ is not a Gödel number.

3.34 Find the Gödel numbers of the following expressions.

a. $f_1^1(a_1)$   b. $((\forall x_3)(\neg A_1^2(a_1, x_3)))$


This method of associating numbers with symbols, expressions and
sequences of expressions was originally devised by Gödel (1931) in order to
arithmetize metamathematics,* that is, to replace assertions about a formal
system by equivalent number-theoretic statements and then to express these
statements within the formal system itself. This idea turned out to be the key
to many significant problems in mathematical logic.

The assignment of Gödel numbers given here is in no way unique. Other methods
are found in Kleene (1952, Chapter X) and in Smullyan (1961, Chapter 1, § 6).


Definition 


A theory K is said to have a primitive recursive vocabulary (or a recursive vocabu- 
lary) if the following properties are primitive recursive (or recursive): 


a. IC(x): x is the Gödel number of an individual constant of K;
b. FL(x): x is the Gödel number of a function letter of K;
c. PL(x): x is the Gödel number of a predicate letter of K.


* An arithmetization of a theory K is a one-one function g from the set of symbols of K,
expressions of K and finite sequences of expressions of K into the set of positive integers.
The following conditions are to be satisfied by the function g: (1) g is effectively computable;
(2) there is an effective procedure that determines whether any given positive integer m is in
the range of g and, if m is in the range of g, the procedure finds the object x such that g(x) = m.

Remark

Any theory K that has only a finite number of individual constants, function
letters, and predicate letters has a primitive recursive vocabulary. For
example, if the individual constants of K are $a_{j_1}, a_{j_2}, \ldots, a_{j_n}$, then IC(x) if and
only if $x = 7 + 8j_1 \vee x = 7 + 8j_2 \vee \ldots \vee x = 7 + 8j_n$. In particular, any theory K in
the language $\mathscr{L}_A$ of arithmetic has a primitive recursive vocabulary. So, S has
a primitive recursive vocabulary.


Proposition 3.26 


Let K be a theory with a primitive recursive (or recursive) vocabulary. Then 
the following relations and functions (1-16) are primitive recursive (or recur- 
sive). In each case, we give first the notation and intuitive definition for the 
relation or function, and then an equivalent formula from which its primi- 
tive recursiveness (or recursiveness) can be deduced. 


1. EVbl(x): x is the Gödel number of an expression consisting of a
   variable, (∃z)_{z<x}(1 ≤ z ∧ x = 2^{13+8z}). By Proposition 3.18, this is
   primitive recursive.

   EIC(x): x is the Gödel number of an expression consisting of an
   individual constant, (∃y)_{y<x}(IC(y) ∧ x = 2^y) (Proposition 3.18).

   EFL(x): x is the Gödel number of an expression consisting of a
   function letter, (∃y)_{y<x}(FL(y) ∧ x = 2^y) (Proposition 3.18).

   EPL(x): x is the Gödel number of an expression consisting of a
   predicate letter, (∃y)_{y<x}(PL(y) ∧ x = 2^y) (Proposition 3.18).

2. Arg_T(x) = (qt(8, x ∸ 1))_0: If x is the Gödel number of a function letter
   f_j^n, then Arg_T(x) = n. Arg_T(x) is primitive recursive.

   Arg_P(x) = (qt(8, x ∸ 3))_0: If x is the Gödel number of a predicate letter
   A_j^n, then Arg_P(x) = n. Arg_P(x) is primitive recursive.

3. Gd(x): x is the Gödel number of an expression of K, EVbl(x) ∨ EIC(x)
   ∨ EFL(x) ∨ EPL(x) ∨ x = 2^3 ∨ x = 2^5 ∨ x = 2^7 ∨ x = 2^9 ∨ x = 2^11 ∨ x = 2^13 ∨
   (∃u)_{u<x}(∃v)_{v<x}(x = u * v ∧ Gd(u) ∧ Gd(v)). Use Corollary 3.21. Here, * is
   the juxtaposition function defined in Example 4 on page 182.

4. MP(x, y, z): The expression with Gödel number z is a direct
   consequence of the expressions with Gödel numbers x and y by modus
   ponens, y = 2^3 * x * 2^11 * z * 2^5 ∧ Gd(x) ∧ Gd(z).

5. Gen(x, y): The expression with Gödel number y comes from the
   expression with Gödel number x by the generalization rule:


       (∃v)_{v<y}(EVbl(v) ∧ y = 2^3 * 2^3 * 2^13 * v * 2^5 * x * 2^5 ∧ Gd(x))


6. Trm(x): x is the Gödel number of a term of K. This holds when and
   only when either x is the Gödel number of an expression consisting
   of a variable or an individual constant, or there is a function letter f_k^n
   and terms t_1, ..., t_n such that x is the Gödel number of f_k^n(t_1, ..., t_n).
   The latter holds if and only if there is a sequence of n + 1 expressions


       f_k^n(    f_k^n(t_1,    f_k^n(t_1, t_2,    ...    f_k^n(t_1, ..., t_{n-1},    f_k^n(t_1, ..., t_{n-1}, t_n)


   the last of which, f_k^n(t_1, ..., t_n), has Gödel number x. This
   sequence can be represented by its Gödel number y. Clearly,
   y < 2^x 3^x ... p_n^x = (2 · 3 · ... · p_n)^x < (p_n!)^x < (p_x!)^x. Note that ℓh(y) = n + 1
   and also that n = Arg_T((x)_0), since (x)_0 is the Gödel number of f_k^n.
   Hence, Trm(x) is equivalent to the following relation:


       EVbl(x) ∨ EIC(x) ∨ (∃y)_{y<(p_x!)^x} [x = (y)_{ℓh(y)∸1} ∧
       ℓh(y) = Arg_T((x)_0) + 1 ∧ FL(((y)_0)_0) ∧ ((y)_0)_1 = 3 ∧
       ℓh((y)_0) = 2 ∧ (∀u)_{u<ℓh(y)∸2}(∃v)_{v<x}((y)_{u+1} = (y)_u * v * 2^7 ∧ Trm(v)) ∧
       (∃v)_{v<x}((y)_{ℓh(y)∸1} = (y)_{ℓh(y)∸2} * v * 2^5 ∧ Trm(v))]


   Thus, Trm(x) is primitive recursive (or recursive) by Corollary 3.21,
   since the formula above involves Trm(v) for only v < x. In fact, if we
   replace both occurrences of Trm(v) in the formula by (z)_v = 0, then the
   new formula defines a primitive recursive (or recursive) relation H(x, z),
   and Trm(x) ⇔ H(x, (C_Trm)^#(x)). Therefore, Corollary 3.21 is applicable.

7. Atfml(x): x is the Gödel number of an atomic wf of K. This holds if
   and only if there are terms t_1, ..., t_n and a predicate letter A_k^n such
   that x is the Gödel number of A_k^n(t_1, ..., t_n). The latter holds if and
   only if there is a sequence of n + 1 expressions:


       A_k^n(    A_k^n(t_1,    A_k^n(t_1, t_2,    ...    A_k^n(t_1, ..., t_{n-1},    A_k^n(t_1, ..., t_{n-1}, t_n)


   the last of which, A_k^n(t_1, ..., t_n), has Gödel number x. This sequence
   of expressions can be represented by its Gödel number y. Clearly,
   y < (p_x!)^x (as in (6) above) and n = Arg_P((x)_0). Thus, Atfml(x) is
   equivalent to the following:


       (∃y)_{y<(p_x!)^x} [x = (y)_{ℓh(y)∸1} ∧ ℓh(y) = Arg_P((x)_0) + 1 ∧
       PL(((y)_0)_0) ∧ ((y)_0)_1 = 3 ∧ ℓh((y)_0) = 2 ∧
       (∀u)_{u<ℓh(y)∸2}(∃v)_{v<x}((y)_{u+1} = (y)_u * v * 2^7 ∧ Trm(v)) ∧
       (∃v)_{v<x}((y)_{ℓh(y)∸1} = (y)_{ℓh(y)∸2} * v * 2^5 ∧ Trm(v))]


   Hence, by Proposition 3.18, Atfml(x) is primitive recursive (or
   recursive).
8. Fml(y): y is the Gödel number of a formula of K:


       Atfml(y) ∨ (∃z)_{z<y} [(Fml(z) ∧ y = 2^3 * 2^9 * z * 2^5) ∨
       (Fml((z)_0) ∧ Fml((z)_1) ∧ y = 2^3 * (z)_0 * 2^11 * (z)_1 * 2^5) ∨
       (Fml((z)_0) ∧ EVbl((z)_1) ∧ y = 2^3 * 2^3 * 2^13 * (z)_1 * 2^5 * (z)_0 * 2^5)]


   It is easy to verify that Corollary 3.21 is applicable.

9. Subst(x, y, u, v): x is the Gödel number of the result of substituting in
   the expression with Gödel number y the term with Gödel number u
   for all free occurrences of the variable with Gödel number v:


       Gd(y) ∧ Trm(u) ∧ EVbl(2^v) ∧ [(y = 2^v ∧ x = u) ∨
       (∃w)_{w<y}(y = 2^w ∧ y ≠ 2^v ∧ x = y) ∨
       (∃z)_{z<y}(∃w)_{w<y}(Fml(w) ∧ y = 2^3 * 2^13 * 2^v * 2^5 * w * z ∧
           (∃α)_{α<x}(x = 2^3 * 2^13 * 2^v * 2^5 * w * α ∧ Subst(α, z, u, v))) ∨
       (¬(∃z)_{z<y}(∃w)_{w<y}(Fml(w) ∧ y = 2^3 * 2^13 * 2^v * 2^5 * w * z) ∧
           (∃α)_{α<x}(∃β)_{β<x}(∃z)_{z<y}(1 < z ∧ y = 2^{(y)_0} * z ∧ x = α * β ∧
           Subst(α, 2^{(y)_0}, u, v) ∧ Subst(β, z, u, v)))]


   Corollary 3.21 is applicable.* The reader should verify that this
   formula actually captures the intuitive content of Subst(x, y, u, v).

10. Sub(y, u, v): the Gödel number of the result of substituting the term
    with Gödel number u for all free occurrences in the expression with
    Gödel number y of the variable with Gödel number v:


        Sub(y, u, v) = μx_{x<(p_{uy}!)^{uy}} Subst(x, y, u, v)


    Therefore, Sub is primitive recursive (or recursive) by Proposition
    3.18. (When the conditions on u, v, and y are not met, Sub(y, u, v) is
    defined, but its value is of no interest.)

11. Fr(y, v): y is the Gödel number of a wf or term of K that contains free
    occurrences of the variable with Gödel number v:


        (Fml(y) ∨ Trm(y)) ∧ EVbl(2^v) ∧ ¬Subst(y, y, 2^{13+8v}, v)


    (That is, substitution in the wf or term with Gödel number y of a
    certain variable different from the variable with Gödel number v
    for all free occurrences of the variable with Gödel number v yields
    a different expression.)


* Actually a simultaneous recursion in x and y is involved. In the given formula, replace “x”
  by “(q)_0” and “y” by “(q)_1”; the result, by Corollary 3.21, yields a recursive relation R(q, u, v).
  Now define Subst(x, y, u, v) as R(2^x 3^y, u, v).

12. Ff(u, v, w): u is the Gödel number of a term that is free for the
    variable with Gödel number v in the wf with Gödel number w:


        Trm(u) ∧ EVbl(2^v) ∧ Fml(w) ∧ [Atfml(w)
        ∨ (∃y)_{y<w}(w = 2^3 * 2^9 * y * 2^5 ∧ Ff(u, v, y))
        ∨ (∃y)_{y<w}(∃z)_{z<w}(w = 2^3 * y * 2^11 * z * 2^5
            ∧ Ff(u, v, y) ∧ Ff(u, v, z))
        ∨ (∃y)_{y<w}(∃z)_{z<w}(w = 2^3 * 2^3 * 2^13 * 2^z * 2^5 * y * 2^5
            ∧ EVbl(2^z) ∧ (z ≠ v ⇒ Ff(u, v, y)
            ∧ (Fr(u, z) ⇒ ¬Fr(y, v))))]


    Use Corollary 3.21 again.
13. a. Ax_1(x): x is the Gödel number of an instance of axiom schema (A1):


        (∃u)_{u<x}(∃v)_{v<x}(Fml(u) ∧ Fml(v)
        ∧ x = 2^3 * u * 2^11 * 2^3 * v * 2^11 * u * 2^5 * 2^5)


    b. Ax_2(x): x is the Gödel number of an instance of axiom schema (A2):


        (∃u)_{u<x}(∃v)_{v<x}(∃w)_{w<x}(Fml(u) ∧ Fml(v) ∧ Fml(w)
        ∧ x = 2^3 * 2^3 * u * 2^11 * 2^3 * v * 2^11 * w * 2^5 * 2^5 * 2^11
            * 2^3 * 2^3 * u * 2^11 * v * 2^5 * 2^11 * 2^3 * u * 2^11 * w * 2^5 * 2^5 * 2^5)


    c. Ax_3(x): x is the Gödel number of an instance of axiom schema (A3):


        (∃u)_{u<x}(∃v)_{v<x}(Fml(u) ∧ Fml(v)
        ∧ x = 2^3 * 2^3 * 2^3 * 2^9 * v * 2^5 * 2^11 * 2^3 * 2^9 * u * 2^5 * 2^5 * 2^11
            * 2^3 * 2^3 * 2^3 * 2^9 * v * 2^5 * 2^11 * u * 2^5 * 2^11 * v * 2^5 * 2^5)


    d. Ax_4(x): x is the Gödel number of an instance of axiom schema (A4):


        (∃u)_{u<x}(∃v)_{v<x}(∃y)_{y<x}(Fml(y) ∧ Trm(u) ∧ EVbl(2^v) ∧ Ff(u, v, y)
        ∧ x = 2^3 * 2^3 * 2^3 * 2^13 * 2^v * 2^5 * y * 2^5 * 2^11 * Sub(y, u, v) * 2^5)
    e. Ax_5(x): x is the Gödel number of an instance of axiom schema (A5):


        (∃u)_{u<x}(∃v)_{v<x}(∃w)_{w<x}(Fml(u) ∧ Fml(w) ∧ EVbl(2^v) ∧ ¬Fr(u, v)
        ∧ x = 2^3 * 2^3 * 2^3 * 2^13 * 2^v * 2^5 * 2^3 * u * 2^11 * w * 2^5 * 2^5 * 2^11
            * 2^3 * u * 2^11 * 2^3 * 2^3 * 2^13 * 2^v * 2^5 * w * 2^5 * 2^5 * 2^5)


    f. LAX(y): y is the Gödel number of a logical axiom of K:


        Ax_1(y) ∨ Ax_2(y) ∨ Ax_3(y) ∨ Ax_4(y) ∨ Ax_5(y)


14. The following negation function is primitive recursive. Neg(x): the
    Gödel number of (¬ℬ) if x is the Gödel number of ℬ:


        Neg(x) = 2^3 * 2^9 * x * 2^5


15. The following conditional function is primitive recursive. Cond(x, y):
    the Gödel number of (ℬ ⇒ 𝒞) if x is the Gödel number of ℬ and y is
    the Gödel number of 𝒞:


        Cond(x, y) = 2^3 * x * 2^11 * y * 2^5


16. Clos(u): the Gödel number of the closure of ℬ if u is the Gödel
    number of a wf ℬ. First, let V(u) = μv_{v≤u}(EVbl(2^v) ∧ Fr(u, v)). V is
    primitive recursive (or recursive). V(u) is the least Gödel number of a free
    variable of u (if there are any). Let Sent(u) be Fml(u) ∧ ¬(∃v)_{v≤u} Fr(u, v).
    Sent is primitive recursive (or recursive). Sent(u) holds when and
    only when u is the Gödel number of a sentence (i.e., a closed wf).
    Now let


        G(u) = 2^3 * 2^3 * 2^13 * 2^{V(u)} * 2^5 * u * 2^5    if Fml(u) ∧ ¬Sent(u)
        G(u) = u                                            otherwise


    G is primitive recursive (or recursive). If u is the Gödel number of a wf ℬ that
    is not a closed wf, then G(u) is the Gödel number of (∀x)ℬ, where x is the free
    variable of ℬ that has the least Gödel number. Otherwise, G(u) = u. Now, let


        H(u, 0) = G(u)
        H(u, y + 1) = G(H(u, y))


    H is primitive recursive (or recursive). Finally,


        Clos(u) = H(u, μy_{y≤u}(H(u, y) = H(u, y + 1)))


    Thus, Clos is primitive recursive (or recursive).
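To see the juxtaposition function at work, here is a small Python sketch of Neg, Cond, and the equation inside MP. It assumes the symbol codes 3, 5, 9, 11 for ( ) ¬ ⇒; the helper names primes, exponents, number, and juxt are ours and serve only the illustration. The conjuncts Gd(x) and Gd(z) of MP are deliberately left out.

```python
from itertools import count


def primes():
    """Yield 2, 3, 5, ... by trial division (adequate for a small demo)."""
    found = []
    for n in count(2):
        if all(n % p for p in found):
            found.append(n)
            yield n


def exponents(x):
    """Exponents e0, e1, ... with x = 2^e0 * 3^e1 * ... for x >= 1."""
    out = []
    for p in primes():
        if x == 1:
            return out
        e = 0
        while x % p == 0:
            x, e = x // p, e + 1
        out.append(e)


def number(exps):
    """Inverse of exponents: 2^exps[0] * 3^exps[1] * ..."""
    x = 1
    for p, e in zip(primes(), exps):
        x *= p ** e
    return x


def juxt(*codes):
    """Juxtaposition x * y * ...: concatenate the exponent lists."""
    return number([e for c in codes for e in exponents(c)])


def neg(x):
    """Neg(x) = 2^3 * 2^9 * x * 2^5."""
    return juxt(2**3, 2**9, x, 2**5)


def cond(x, y):
    """Cond(x, y) = 2^3 * x * 2^11 * y * 2^5."""
    return juxt(2**3, x, 2**11, y, 2**5)


def mp(x, y, z):
    """The equation y = 2^3 * x * 2^11 * z * 2^5 of MP (Gd checks omitted)."""
    return y == cond(x, z)
```

Working with exponent lists keeps the definitions close to the displayed formulas, at the price of very large integers for all but the shortest expressions.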
Proposition 3.27


Let K be a theory having a primitive recursive (or recursive) vocabulary and
whose language contains the individual constant 0 and the function letter
f_1^1 of ℒ_A. (Thus, all the numerals are terms of K. In particular, K can be S
itself.) Then the following functions and relation are primitive recursive (or
recursive).


17. Num(y): the Gödel number of the expression ȳ


        Num(0) = 2^15
        Num(y + 1) = 2^49 * 2^3 * Num(y) * 2^5


    Num is primitive recursive by virtue of the recursion rule (V).

18. Nu(x): x is the Gödel number of a numeral


        (∃y)_{y<x}(x = Num(y))


    Nu is primitive recursive by Proposition 3.18.

19. D(u): the Gödel number of ℬ(ū), if u is the Gödel number of a wf ℬ(x_1):


        D(u) = Sub(u, Num(u), 21)


    Thus, D is primitive recursive (or recursive). D is called the diagonal
    function.
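The recursion for Num can be followed concretely on lists of symbol codes, converting to a Gödel number only at the end. The sketch below assumes the codes 15 for 0, 49 for f_1^1, and 3, 5 for the parentheses; the function names are ours. The numeral test searches y below the length of the list rather than below x as in Nu, which is enough because the numeral for y has 3y + 1 symbols.

```python
ZERO, SUCC, LPAREN, RPAREN = 15, 49, 3, 5   # assumed codes for 0, f_1^1, "(", ")"


def numeral_codes(y):
    """Symbol codes of the numeral for y, following Num(0) and Num(y + 1)."""
    codes = [ZERO]
    for _ in range(y):
        codes = [SUCC, LPAREN] + codes + [RPAREN]
    return codes


def first_primes(k):
    """The first k primes, by trial division."""
    ps, n = [], 2
    while len(ps) < k:
        if all(n % p for p in ps):
            ps.append(n)
        n += 1
    return ps


def godel_number(codes):
    """2^c0 * 3^c1 * 5^c2 * ... for the symbol codes c0, c1, c2, ..."""
    x = 1
    for p, c in zip(first_primes(len(codes)), codes):
        x *= p ** c
    return x


def num(y):
    """Num(y): the Gödel number of the numeral for y."""
    return godel_number(numeral_codes(y))


def is_numeral_codes(codes):
    """Bounded search in the spirit of Nu, carried out on code lists."""
    return any(numeral_codes(y) == codes for y in range(len(codes)))
```

Here num(1) is 2^49 · 3^3 · 5^15 · 7^5, the Gödel number of f_1^1(0).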
Definition


A theory K will be said to have a primitive recursive (or recursive) axiom set if
the following property PrAx is primitive recursive (or recursive):


    PrAx(y): y is the Gödel number of a proper axiom of K


Notice that S has a primitive recursive axiom set. Let a_1, a_2, ..., a_8 be the
Gödel numbers of axioms (S1)-(S8). It is easy to see that a number y is the Gödel
number of an instance of axiom schema (S9) if and only if


    (∃v)_{v<y}(∃w)_{w<y}(EVbl(2^v) ∧ Fml(w)
    ∧ y = 2^3 * Sub(w, 2^15, v) * 2^11 * 2^3 * 2^3 * 2^3 * 2^13 * 2^v * 2^5
        * 2^3 * w * 2^11 * Sub(w, 2^49 * 2^3 * 2^v * 2^5, v) * 2^5 * 2^5 * 2^11
        * 2^3 * 2^3 * 2^13 * 2^v * 2^5 * w * 2^5 * 2^5 * 2^5)


Denote the displayed formula by A_9(y). Then y is the Gödel number of a
proper axiom of S if and only if


    y = a_1 ∨ y = a_2 ∨ ... ∨ y = a_8 ∨ A_9(y)


Thus, PrAx(y) is primitive recursive for S.


Proposition 3.28 


Let K be a theory having a primitive recursive (or recursive) vocabulary and 
a primitive recursive (or recursive) axiom set. Then the following three rela- 
tions are primitive recursive (or recursive). 


20. Ax(y): y is the Gödel number of an axiom of K:


        LAX(y) ∨ PrAx(y)


21. Prf(y): y is the Gödel number of a proof in K:


        (∃u)_{u<y}(∃v)_{v<y}(∃z)_{z<y}(∃w)_{w<y}([y = 2^w ∧ Ax(w)] ∨
        [Prf(u) ∧ Fml((u)_w) ∧ y = u * 2^v ∧ Gen((u)_w, v)] ∨
        [Prf(u) ∧ Fml((u)_z) ∧ Fml((u)_w) ∧ y = u * 2^v ∧ MP((u)_z, (u)_w, v)] ∨
        [Prf(u) ∧ y = u * 2^v ∧ Ax(v)])


    Apply Corollary 3.21.

22. Pf(y, x): y is the Gödel number of a proof in K of the wf with Gödel
    number x:


        Prf(y) ∧ x = (y)_{ℓh(y)∸1}


The relations and functions of Propositions 3.26-3.28 should have the sub- 
script “K” attached to the corresponding signs to indicate the dependence 
on K. If we considered a different theory, then we would obtain different 
relations and functions. 
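The shape of Prf and Pf is easier to see when proofs are kept as lists of formulas instead of Gödel numbers. The following Python sketch makes that simplification: formulas are strings (atoms) or tuples such as ("imp", A, B) and ("all", "x", A), and the axiom test is passed in as a parameter. This representation and all names are assumptions of the sketch, not part of the text.

```python
def is_proof(lines, is_axiom):
    """Each line is an axiom, or follows from earlier lines by MP or Gen."""
    for i, line in enumerate(lines):
        earlier = lines[:i]
        by_axiom = is_axiom(line)
        # MP: some earlier line a and some earlier line (a => line)
        by_mp = any(b == ("imp", a, line) for a in earlier for b in earlier)
        # Gen: line is (all x) c for some earlier line c
        by_gen = (isinstance(line, tuple) and len(line) == 3
                  and line[0] == "all" and line[2] in earlier)
        if not (by_axiom or by_mp or by_gen):
            return False
    return len(lines) > 0


def proves(lines, formula, is_axiom):
    """Analogue of Pf(y, x): `lines` is a proof whose last line is `formula`."""
    return is_proof(lines, is_axiom) and lines[-1] == formula
```

As in the formula for Prf, a proof is built up one line at a time, and each new line is justified only by lines that precede it.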


Exercise 


3.35 a. If K is a theory for which the property Fml(y) is primitive recursive
        (or recursive), prove that K has a primitive recursive (or recursive)
        vocabulary.


     b. Let K be a theory for which the property Ax(y) is primitive
        recursive (or recursive).

        i. Show that K has a primitive recursive (or recursive) vocabulary.


        ii. Assuming also that no proper axiom of K is a logical axiom,
            prove that K has a primitive recursive (or recursive) axiom set.


Proposition 3.29 


Let K be a theory with equality whose language contains the individual
constant 0 and the function letter f_1^1 and such that K has a primitive recursive (or
recursive) vocabulary and axiom set. Also assume:


    (*) For any natural numbers r and s, if ⊢_K r̄ = s̄, then r = s.


Then any function f(x_1, ..., x_n) that is representable in K is recursive.


Proof


Let ℬ(x_1, ..., x_n, x_{n+1}) be a wf of K that represents f. Let P_ℬ(u_1, ..., u_n, u_{n+1}, y) mean
that y is the Gödel number of a proof in K of the wf ℬ(ū_1, ..., ū_n, ū_{n+1}). Note
that, if P_ℬ(u_1, ..., u_n, u_{n+1}, y), then f(u_1, ..., u_n) = u_{n+1}. (In fact, let f(u_1, ..., u_n) = r.
Since ℬ represents f in K, ⊢_K ℬ(ū_1, ..., ū_n, r̄) and ⊢_K (∃_1 y)ℬ(ū_1, ..., ū_n, y). By
hypothesis, P_ℬ(u_1, ..., u_n, u_{n+1}, y). Hence, ⊢_K ℬ(ū_1, ..., ū_n, ū_{n+1}). Since K is a
theory with equality, it follows that ⊢_K r̄ = ū_{n+1}. By (*), r = u_{n+1}.) Now let m be the
Gödel number of ℬ(x_1, ..., x_n, x_{n+1}). Then P_ℬ(u_1, ..., u_n, u_{n+1}, y) is equivalent to:


    Pf(y, Sub(... Sub(Sub(m, Num(u_1), 21), Num(u_2), 29) ... Num(u_{n+1}), 21 + 8n))


So, by Propositions 3.26-3.28, P_ℬ(u_1, ..., u_n, u_{n+1}, y) is primitive recursive (or
recursive). Now consider any natural numbers k_1, ..., k_n. Let f(k_1, ..., k_n) = r. Then
⊢_K ℬ(k̄_1, ..., k̄_n, r̄). Let j be the Gödel number of a proof in K of ℬ(k̄_1, ..., k̄_n, r̄).
Then P_ℬ(k_1, ..., k_n, r, j). Thus, for any x_1, ..., x_n, there is some y such that
P_ℬ(x_1, ..., x_n, (y)_0, (y)_1). Then, by Exercise 3.16(c), μy(P_ℬ(x_1, ..., x_n, (y)_0, (y)_1)) is recursive.
But, f(x_1, ..., x_n) = (μy(P_ℬ(x_1, ..., x_n, (y)_0, (y)_1)))_0 and, therefore, f is recursive.
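The last step of the proof is an unbounded search over numbers y that code a pair (value, proof) through the exponents (y)_0 and (y)_1. A minimal Python sketch of that search follows. The predicate toy_proves is a made-up stand-in for P_ℬ (a real one would check an actual proof in K), and all names are ours.

```python
def exponent(y, p):
    """Exponent of the prime p in y >= 1; (y)_0 and (y)_1 use p = 2 and p = 3."""
    e = 0
    while y % p == 0:
        y //= p
        e += 1
    return e


def compute_by_proof_search(proves, args):
    """Return (mu y [proves(args, (y)_0, (y)_1)])_0 by trying y = 1, 2, 3, ..."""
    y = 1
    while True:
        value, proof = exponent(y, 2), exponent(y, 3)
        if proves(args, value, proof):
            return value
        y += 1


def toy_proves(args, value, proof):
    # Hypothetical stand-in for P_B: pretend that the only "proof" of
    # f(args) = value, for f = addition, has code value + 1.
    return value == sum(args) and proof == value + 1
```

The search terminates only because some y with the required property is guaranteed to exist, which is exactly what representability of f provides in the proof.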


Exercise 


3.36 Let K be a theory whose language contains the predicate letter =, the
     individual constant 0, and the function letter f_1^1.

     a. If K satisfies hypothesis (*) of Proposition 3.29, prove that K must be
        consistent.

     b. If K is inconsistent, prove that every number-theoretic function is
        representable in K.

     c. If K is consistent and the identity relation x = y is expressible in K,
        show that K satisfies hypothesis (*) of Proposition 3.29.


Corollary 3.30


Assume S consistent. Then the class of recursive functions is identical with 
the class of functions representable in S. 


Proof 


We have observed that S has a primitive recursive vocabulary and axiom 
set. By Exercise 3.36(c) and the already noted fact that the identity relation is 
expressible in S, we see that Proposition 3.29 entails that every function rep- 
resentable in S is recursive. On the other hand, Proposition 3.24 tells us that 
every recursive function is representable in S. 

In Chapter 5, it will be made plausible that the notion of recursive function 
is a precise mathematical equivalent of the intuitive idea of effectively comput- 
able function. 


Corollary 3.31 


A number-theoretic relation R(x_1, ..., x_n) is recursive if and only if it is
expressible in S.


Proof


By definition, R is recursive if and only if C_R is recursive. By Corollary 3.30,
C_R is recursive if and only if C_R is representable in S. But, by Proposition 3.13,
C_R is representable in S if and only if R is expressible in S.

It will be helpful later to find weaker theories than S for which the repre- 
sentable functions are identical with the recursive functions. Analysis of the 
proof of Proposition 3.24 leads us to the following theory. 


Robinson’s System 


Consider the theory in the language ℒ_A with the following finite list of proper
axioms.


 1. x_1 = x_1
 2. x_1 = x_2 ⇒ x_2 = x_1
 3. x_1 = x_2 ⇒ (x_2 = x_3 ⇒ x_1 = x_3)
 4. x_1 = x_2 ⇒ x_1′ = x_2′
 5. x_1 = x_2 ⇒ (x_1 + x_3 = x_2 + x_3 ∧ x_3 + x_1 = x_3 + x_2)
 6. x_1 = x_2 ⇒ (x_1 · x_3 = x_2 · x_3 ∧ x_3 · x_1 = x_3 · x_2)
 7. x_1′ = x_2′ ⇒ x_1 = x_2
 8. 0 ≠ x_1′
 9. x_1 ≠ 0 ⇒ (∃x_2)(x_1 = x_2′)
10. x_1 + 0 = x_1
11. x_1 + x_2′ = (x_1 + x_2)′
12. x_1 · 0 = 0
13. x_1 · x_2′ = (x_1 · x_2) + x_1
14. (x_2 = x_1 · x_3 + x_4 ∧ x_4 < x_1 ∧ x_2 = x_1 · x_5 + x_6 ∧ x_6 < x_1) ⇒ x_4 = x_6
    (uniqueness of remainder)


We shall call this theory RR. Clearly, RR is a subtheory of S, since all the axioms
of RR are theorems of S. In addition, it follows from Proposition 2.25 and axioms
(1)-(6) that RR is a theory with equality. (The system Q of axioms (1)-(13) is due
to R.M. Robinson (1950). Axiom (14) has been added to make one of the proofs
below easier.) Notice that RR has only a finite number of proper axioms.
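Since RR is a subtheory of S, its axioms hold in the standard model. As an informal sanity check of how axioms (8)-(14) read there (a finite test, of course, not a proof), one can evaluate them by brute force over a small initial segment of the natural numbers. The Python sketch below does this; the function name and the bound are ours.

```python
from itertools import product


def check_rr_in_standard_model(bound=6):
    """Brute-force check of axioms (8)-(14) for all values below `bound`."""
    succ = lambda n: n + 1
    r = range(bound)
    ok = all(0 != succ(a) for a in r)                                 # (8)
    ok &= all(a == 0 or any(a == succ(b) for b in r) for a in r)      # (9)
    ok &= all(a + 0 == a and a * 0 == 0 for a in r)                   # (10), (12)
    ok &= all(a + succ(b) == succ(a + b) for a, b in product(r, r))   # (11)
    ok &= all(a * succ(b) == a * b + a for a, b in product(r, r))     # (13)
    ok &= all(x4 == x6                                                # (14)
              for x1, x2, x3, x4, x5, x6 in product(r, repeat=6)
              if x2 == x1 * x3 + x4 and x4 < x1
              and x2 == x1 * x5 + x6 and x6 < x1)
    return ok
```

Axiom (14) is checked in the same form as it is stated: whenever x_2 has two representations as a multiple of x_1 plus a remainder below x_1, the two remainders coincide.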


Lemma 3.32 


In RR, the following are theorems. 


a. n̄ + m̄ = k̄, where k = n + m, for any natural numbers n and m
b. n̄ · m̄ = k̄, where k = n · m, for any natural numbers n and m
c. n̄ ≠ m̄ for any natural numbers n and m such that n ≠ m
d. n̄ < m̄ for any natural numbers n and m such that n < m
e. x ≤ 0 ⇒ x = 0
f. x ≤ n̄ ⇒ x = 0 ∨ x = 1̄ ∨ ... ∨ x = n̄ for any natural number n
g. x ≤ n̄ ∨ n̄ ≤ x for any natural number n


Proof


Parts (a)-(c) are proved the same way as Proposition 3.6(a). Parts (d)-(g) are
left as exercises.


Proposition 3.33 


All recursive functions are representable in RR. 


Proof 


The initial functions Z, N, and U_j^n are representable in RR by the same wfs
as in Examples 1-3, page 171. That the substitution rule does not lead out of
the class of functions representable in RR is proved in the same way as in
Example 4 on page 172. For the recursion rule, first notice that β(x_1, x_2, x_3) is
represented in RR by Bt(x_1, x_2, x_3, y) and that ⊢_RR Bt(x_1, x_2, x_3, y) ∧ Bt(x_1, x_2, x_3, z)
⇒ y = z. Reasoning like that in the proof of Proposition 3.24 shows that the
recursion rule preserves representability in RR.* The argument given for the
restricted μ-operator rule also remains valid for RR.


* This part of the argument is due to Gordon McLean, Jr. 
By Proposition 3.33, all recursive functions are representable in any
extension of RR. Hence, by Proposition 3.29 and Exercise 3.36(c), in any consistent
extension of RR in the language ℒ_A that has a recursive axiom set, the class
of representable functions is the same as the class of recursive functions.
Moreover, by Proposition 3.13, the relations expressible in such a theory are
the recursive relations.


Exercises


3.37^D Show that RR is a proper subtheory of S. [Hint: Find a model for RR
     that is not a model for S.] (Remark: Not only is S different from RR, but
     it is not finitely axiomatizable at all, that is, there is no theory K having
     only a finite number of proper axioms, whose theorems are the same
     as those of S. This was proved by Ryll-Nardzewski, 1953.)

3.38 Show that axiom (14) of RR is not provable from axioms (1)-(13) and,
     therefore, that Q is a proper subtheory of RR. [Hint: Find a model of
     (1)-(13) for which (14) is not true.]

3.39 Let K be a theory in the language ℒ_A with just one proper axiom:
     (∀x_1)(∀x_2)x_1 = x_2.

     a. Show that K is a consistent theory with equality.

     b. Prove that all number-theoretic functions are representable in K.

     c. Which number-theoretic relations are expressible in K? [Hint: Use
        elimination of quantifiers.]

     d. Show that the hypothesis ⊢_K 0 ≠ 1̄ cannot be eliminated from
        Proposition 3.13.

     e. Show that, in Proposition 3.29, the hypothesis (*) cannot be
        replaced by the assumption that K is consistent.

3.40 Let R be the theory in the language ℒ_A having as proper axioms the
     equality axioms (1)-(6) of RR as well as the following five axiom
     schemas, in which n and m are arbitrary natural numbers:

     (R1) n̄ + m̄ = k̄, where k = n + m
     (R2) n̄ · m̄ = k̄, where k = n · m
     (R3) n̄ ≠ m̄ if n ≠ m
     (R4) x ≤ n̄ ⇒ x = 0 ∨ ... ∨ x = n̄
     (R5) x ≤ n̄ ∨ n̄ ≤ x

     Prove the following.

     a. R is not finitely axiomatizable. [Hint: Show that every finite subset
        of the axioms of R has a model that is not a model of R.]

     b. R is a proper subtheory of Q.

     c.^D Every recursive function is representable in R. (See Monk, 1976, p. 248.)

     d. The functions representable in R are the recursive functions.

     e. The relations expressible in R are the recursive relations.


3.5 The Fixed-Point Theorem: Gödel’s Incompleteness Theorem


If K is a theory in the language ℒ_A, recall that the diagonal function D has the
property that, if u is the Gödel number of a wf ℬ(x_1), then D(u) is the Gödel
number of the wf ℬ(ū).


Notation


When 𝒞 is an expression of a theory and the Gödel number of 𝒞 is q, then we
shall denote the numeral q̄ by ⌜𝒞⌝. We can think of ⌜𝒞⌝ as being a “name” for
𝒞 within the language ℒ_A.


Proposition 3.34 (Diagonalization Lemma)


Assume that the diagonal function D is representable in a theory with
equality K in the language ℒ_A. Then, for any wf ℰ(x_1) in which x_1 is the only free
variable, there exists a closed wf 𝒞 such that

    ⊢_K 𝒞 ⇔ ℰ(⌜𝒞⌝)

Proof


Let 𝒟(x_1, x_2) be a wf representing D in K. Construct the wf


    (∇)  (∀x_2)(𝒟(x_1, x_2) ⇒ ℰ(x_2))

Let m be the Gödel number of (∇). Now substitute m̄ for x_1 in (∇):

    (𝒞)  (∀x_2)(𝒟(m̄, x_2) ⇒ ℰ(x_2))

Let q be the Gödel number of this wf 𝒞. So, ⌜𝒞⌝ is q̄. Clearly, D(m) = q. (In fact,
m is the Gödel number of a wf ℬ(x_1), namely, (∇), and q is the Gödel number
of ℬ(m̄).) Since 𝒟 represents D in K,


    (∂)  ⊢_K 𝒟(m̄, q̄)
a. Let us show ⊢_K 𝒞 ⇒ ℰ(q̄).

    1. 𝒞                                        Hyp
    2. (∀x_2)(𝒟(m̄, x_2) ⇒ ℰ(x_2))               Same as 1
    3. 𝒟(m̄, q̄) ⇒ ℰ(q̄)                          2, rule A4
    4. 𝒟(m̄, q̄)                                  (∂)
    5. ℰ(q̄)                                     3, 4, MP
    6. 𝒞 ⊢_K ℰ(q̄)                               1-5
    7. ⊢_K 𝒞 ⇒ ℰ(q̄)                             1-6, Corollary 2.6

b. Let us prove ⊢_K ℰ(q̄) ⇒ 𝒞.

    1. ℰ(q̄)                                     Hyp
    2. 𝒟(m̄, x_2)                                Hyp
    3. (∃_1 x_2)𝒟(m̄, x_2)                       𝒟 represents D
    4. 𝒟(m̄, q̄)                                  (∂)
    5. x_2 = q̄                                  2-4, properties of =
    6. ℰ(x_2)                                   1, 5, substitutivity of =
    7. ℰ(q̄), 𝒟(m̄, x_2) ⊢_K ℰ(x_2)               1-6
    8. ℰ(q̄) ⊢_K 𝒟(m̄, x_2) ⇒ ℰ(x_2)              1-7, Corollary 2.6
    9. ℰ(q̄) ⊢_K (∀x_2)(𝒟(m̄, x_2) ⇒ ℰ(x_2))      8, Gen
   10. ⊢_K ℰ(q̄) ⇒ (∀x_2)(𝒟(m̄, x_2) ⇒ ℰ(x_2))    1-9, Corollary 2.6
   11. ⊢_K ℰ(q̄) ⇒ 𝒞                             Same as 10


From parts (a) and (b), by biconditional introduction, ⊢_K 𝒞 ⇔ ℰ(q̄).


Proposition 3.35 (Fixed-Point Theorem)*


Assume that all recursive functions are representable in a theory with
equality K in the language ℒ_A. Then, for any wf ℰ(x_1) in which x_1 is the only free
variable, there is a closed wf 𝒞 such that


    ⊢_K 𝒞 ⇔ ℰ(⌜𝒞⌝)


* The terms “fixed-point theorem” and “diagonalization lemma” are often used
  interchangeably, but I have adopted the present terminology for convenience of reference. The central
  idea seems to have first received explicit mention by Carnap (1934), who pointed out that the
  result was implicit in the work of Gödel (1931). The use of indirect self-reference was the key
  idea in the explosion of progress in mathematical logic that began in the 1930s.


Proof


By Proposition 3.27, D is recursive.* Hence, D is representable in K and
Proposition 3.34 is applicable.

By Proposition 3.33, the fixed-point theorem holds when K is RR or any
extension of RR. In particular, it holds for S.
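The construction in the diagonalization lemma has a well-known analogue for strings, which may help in seeing why it works. In the Python sketch below, quotation (repr) plays the role of the numeral ⌜ ⌝, the function diag plays the role of D, the string template corresponds to the wf (∇), and the returned string corresponds to 𝒞. This is only an analogy under those assumptions, and every name in it is ours.

```python
def diag(template):
    """Replace the placeholder X in `template` by a quoted copy of `template`.

    Analogue of the diagonal function D: from (a code of) B(x_1) it produces
    (a code of) B applied to its own name.
    """
    return template.replace("X", repr(template))


def fixed_point(prop):
    """Return a Python expression s whose value is prop applied to s itself.

    `prop` is the name of a one-argument function on strings; it must not
    contain the letter X or any quotation marks.
    """
    template = f"{prop}(diag(X))"      # analogue of the wf (nabla)
    return diag(template)              # analogue of the closed wf C


def is_short(text):
    return len(text) < 40


sentence = fixed_point("is_short")
# Evaluating the sentence applies is_short to the sentence's own text.
value = eval(sentence, {"diag": diag, "is_short": is_short})
```

The expression sentence contains a quoted copy of template, and evaluating diag on that copy rebuilds sentence itself; this is the string counterpart of D(m) = q.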


Definitions 


Let K be any theory whose language contains the individual constant 0 and
the function letter f_1^1. Then K is said to be ω-consistent if, for every wf ℬ(x) of
K containing x as its only free variable, if ⊢_K ¬ℬ(n̄) for every natural number
n, then it is not the case that ⊢_K (∃x)ℬ(x).

Let K be any theory in the language ℒ_A. K is said to be a true theory if all
proper axioms of K are true in the standard model. (Since all logical axioms
are true in all models and MP and Gen lead from wfs true in a model to wfs
true in that model, all theorems of a true theory will be true in the standard
model.)

Any true theory K must be ω-consistent. (In fact, if ⊢_K ¬ℬ(n̄) for all natural
numbers n, then ℬ(x) is false for every natural number and, therefore, (∃x)ℬ(x)
cannot be true for the standard model. Hence, (∃x)ℬ(x) cannot be a
theorem of K.) In particular, RR and S are ω-consistent.


Proposition 3.36 


If K is ω-consistent, then K is consistent.


Proof


Let ℰ(x) be any wf containing x as its only free variable. Let ℬ(x) be ℰ(x) ∧
¬ℰ(x). Then ¬ℬ(n̄) is an instance of a tautology. Hence, ⊢_K ¬ℬ(n̄) for every
natural number n. By ω-consistency, not-⊢_K (∃x)ℬ(x). Therefore, K is
consistent. (Remember that every wf is provable in an inconsistent theory, by virtue
of the tautology ¬A ⇒ (A ⇒ B). Hence, if at least one wf is not provable, the
theory must be consistent.)

It will turn out later that the converse of Proposition 3.36 does not hold.


* In fact, D is primitive recursive, since K, being a theory in ℒ_A, has a primitive recursive
  vocabulary.
Definition


An undecidable sentence of a theory K is a closed wf ℬ of K such that neither ℬ
nor ¬ℬ is a theorem of K, that is, such that not-⊢_K ℬ and not-⊢_K ¬ℬ.


Gödel’s Incompleteness Theorem


Let K be a theory with equality in the language ℒ_A satisfying the following
three conditions:


1. K has a recursive axiom set (that is, PrAx(y) is recursive).
2. ⊢_K 0 ≠ 1̄.
3. Every recursive function is representable in K.


By assumption 1, Propositions 3.26-3.28 are applicable. By assumptions
2 and 3 and Proposition 3.13, every recursive relation is expressible in K.
By assumption 3, the fixed-point theorem is applicable. Note that K can
be taken to be RR, S, or, more generally, any extension of RR having a
recursive axiom set. Recall that Pf(y, x) means that y is the Gödel number
of a proof in K of a wf with Gödel number x. By Proposition 3.28, Pf is
recursive. Hence, Pf is expressible in K by a wf 𝒫f(x_1, x_2). Let ℰ(x_1) be the
wf (∀x_2)¬𝒫f(x_2, x_1). By the fixed-point theorem, there must be a closed
wf 𝒢 such that


    ($)  ⊢_K 𝒢 ⇔ (∀x_2)¬𝒫f(x_2, ⌜𝒢⌝)


Observe that, in terms of the standard interpretation, (∀x_2)¬𝒫f(x_2, ⌜𝒢⌝) says
that there is no natural number that is the Gödel number of a proof in K of the
wf 𝒢, which is equivalent to asserting that there is no proof in K of 𝒢. Hence,
𝒢 is equivalent in K to an assertion that 𝒢 is unprovable in K. In other words,
𝒢 says “I am not provable in K”. This is an analogue of the liar paradox: “I am
lying” (that is, “I am not true”). However, although the liar paradox leads to
a contradiction, Gödel (1931) showed that 𝒢 is an undecidable sentence of K.
We shall refer to 𝒢 as a Gödel sentence for K.


Proposition 3.37 (Gödel’s Incompleteness Theorem)


Let K satisfy conditions 1-3. Then


a. If K is consistent, not-⊢_K 𝒢.
b. If K is ω-consistent, not-⊢_K ¬𝒢.


Hence, if K is ω-consistent, 𝒢 is an undecidable sentence of K.


Proof


Let q be the Gödel number of 𝒢.


a. Assume ⊢_K 𝒢. Let r be the Gödel number of a proof in K of 𝒢. Then
   Pf(r, q). Hence, ⊢_K 𝒫f(r̄, q̄), that is, ⊢_K 𝒫f(r̄, ⌜𝒢⌝). But, from ($)
   above by biconditional elimination, ⊢_K (∀x_2)¬𝒫f(x_2, ⌜𝒢⌝). By rule A4,
   ⊢_K ¬𝒫f(r̄, ⌜𝒢⌝). Therefore, K is inconsistent.

b. Assume K is ω-consistent and ⊢_K ¬𝒢. From ($) by biconditional
   elimination, ⊢_K ¬(∀x_2)¬𝒫f(x_2, ⌜𝒢⌝), which abbreviates to


       (*)  ⊢_K (∃x_2)𝒫f(x_2, ⌜𝒢⌝)


   On the other hand, since K is ω-consistent, Proposition 3.36 implies that K
   is consistent. But, ⊢_K ¬𝒢. Hence, not-⊢_K 𝒢, that is, there is no proof in K of 𝒢.
   So, Pf(n, q) is false for every natural number n and, therefore, ⊢_K ¬𝒫f(n̄, ⌜𝒢⌝)
   for every natural number n. (Remember that ⌜𝒢⌝ is q̄.) By ω-consistency,
   not-⊢_K (∃x_2)𝒫f(x_2, ⌜𝒢⌝), contradicting (*).


Remarks 


Gödel’s incompleteness theorem has been established for any theory with
equality K in the language ℒ_A that satisfies conditions 1-3 above. Assume
that K also satisfies the following condition:


    (+) K is a true theory.


(In particular, K can be S or any subtheory of S.) Proposition 3.37(a) shows
that, if K is consistent, 𝒢 is not provable in K. But, under the standard
interpretation, 𝒢 asserts its own unprovability in K. Therefore, 𝒢 is true for the
standard interpretation.

Moreover, when K is a true theory, the following simple intuitive
argument can be given for the undecidability of 𝒢 in K.


i. Assume ⊢_K 𝒢. Since ⊢_K 𝒢 ⇔ (∀x_2)¬𝒫f(x_2, ⌜𝒢⌝), it follows that
   ⊢_K (∀x_2)¬𝒫f(x_2, ⌜𝒢⌝). Since K is a true theory, (∀x_2)¬𝒫f(x_2, ⌜𝒢⌝) is true
   for the standard interpretation. But this wf says that 𝒢 is not
   provable in K, contradicting our original assumption. Hence,
   not-⊢_K 𝒢.

ii. Assume ⊢_K ¬𝒢. Since ⊢_K 𝒢 ⇔ (∀x_2)¬𝒫f(x_2, ⌜𝒢⌝), ⊢_K ¬(∀x_2)¬𝒫f(x_2, ⌜𝒢⌝).
    So, ⊢_K (∃x_2)𝒫f(x_2, ⌜𝒢⌝). Since K is a true theory, this wf is true for the
    standard interpretation, that is, 𝒢 is provable in K. This contradicts
    the result of (i). Hence, not-⊢_K ¬𝒢.
Exercises


3.41 Let 𝒢 be a Gödel sentence for S. Let S_g be the extension of S obtained
     by adding ¬𝒢 as a new axiom. Prove that, if S is consistent, then S_g is
     consistent, but not ω-consistent.


3.42 A theory K whose language has the individual constant 0 and
     function letter f_1^1 is said to be ω-incomplete if there is a wf ℰ(x) with one free
     variable x such that ⊢_K ℰ(n̄) for every natural number n, but it is not
     the case that ⊢_K (∀x)ℰ(x). If K is a consistent theory with equality in the
     language ℒ_A and satisfies conditions 1-3 on page 208, show that K is
     ω-incomplete. (In particular, RR and S are ω-incomplete.)


3.43 Let K be a theory whose language contains the individual constant 0
     and function letter f_1^1. Show that, if K is consistent and ω-inconsistent,
     then K is ω-incomplete.


3.44 Prove that S, as well as any consistent extension of S having a recursive
     axiom set, is not a scapegoat theory. (See page 85.)

3.45 Show that there is an ω-consistent extension K of S such that K is not a
     true theory. [Hint: Use the fixed-point theorem.]


The Gödel-Rosser Incompleteness Theorem


The proof of undecidability of a Gödel sentence 𝒢 required the assumption
of ω-consistency. We will now prove a result of Rosser (1936) showing that, at
the cost of a slight increase in the complexity of the undecidable sentence, the
assumption of ω-consistency can be replaced by consistency.

As before, let K be a theory with equality in the language ℒ_A satisfying
conditions 1-3 on page 208. In addition, assume:


4. ⊢_K x ≤ n̄ ⇒ x = 0 ∨ x = 1̄ ∨ ... ∨ x = n̄ for every natural number n.
5. ⊢_K x ≤ n̄ ∨ n̄ ≤ x for every natural number n.


Thus, K can be any extension of RR with a recursive axiom set. In particular,
K can be RR or S.

Recall that, by Proposition 3.26 (14), Neg is a primitive recursive function
such that, if x is the Gödel number of a wf ℬ, then Neg(x) is the Gödel
number of (¬ℬ). Since all recursive functions are representable in K, let 𝒩eg(x_1, x_2)
be a wf that represents Neg in K. Now construct the following wf ℰ(x_1):


    (∀x_2)(𝒫f(x_2, x_1) ⇒ (∀x_3)(𝒩eg(x_1, x_3) ⇒ (∃x_4)(x_4 ≤ x_2 ∧ 𝒫f(x_4, x_3))))


By the fixed-point theorem, there is a closed wf ℛ such that


    (*)  ⊢_K ℛ ⇔ ℰ(⌜ℛ⌝)
ℛ is called a Rosser sentence for K. Notice what the intuitive meaning of ℛ is
under the standard interpretation. ℛ asserts that, if ℛ has a proof in K, say
with Gödel number x_2, then ¬ℛ has a proof in K with Gödel number smaller
than x_2. This is a roundabout way for ℛ to claim its own unprovability under
the assumption of the consistency of K.


Proposition 3.38 (Gödel-Rosser Theorem)


Let K satisfy conditions 1-5. If K is consistent, then ℛ is an undecidable
sentence of K.


Proof 


Let p be the Gödel number of $\mathcal{R}$. Thus, $\ulcorner\mathcal{R}\urcorner$ is $\bar{p}$. Let j be the Gödel number
of $\lnot\mathcal{R}$.


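a. Assume $\vdash_K \mathcal{R}$. Since $\vdash_K \mathcal{R} \Leftrightarrow \mathcal{E}(\ulcorner\mathcal{R}\urcorner)$, biconditional elimination yields
$\vdash_K \mathcal{E}(\ulcorner\mathcal{R}\urcorner)$, that is


$\vdash_K (\forall x_2)(\mathcal{P}f(x_2, \bar{p}) \Rightarrow (\forall x_3)(\mathcal{N}eg(\bar{p}, x_3) \Rightarrow (\exists x_4)(x_4 \le x_2 \land \mathcal{P}f(x_4, x_3))))$


Let k be the Gödel number of a proof in K of $\mathcal{R}$. Then Pf(k, p) and,
therefore, $\vdash_K \mathcal{P}f(\bar{k}, \bar{p})$. Applying rule A4 to $\mathcal{E}(\ulcorner\mathcal{R}\urcorner)$, we obtain


$\vdash_K \mathcal{P}f(\bar{k}, \bar{p}) \Rightarrow (\forall x_3)(\mathcal{N}eg(\bar{p}, x_3) \Rightarrow (\exists x_4)(x_4 \le \bar{k} \land \mathcal{P}f(x_4, x_3)))$


So, by MP,

(**) $\vdash_K (\forall x_3)(\mathcal{N}eg(\bar{p}, x_3) \Rightarrow (\exists x_4)(x_4 \le \bar{k} \land \mathcal{P}f(x_4, x_3)))$


Since j is the Gödel number of $\lnot\mathcal{R}$, we have Neg(p) = j and,
therefore, $\vdash_K \mathcal{N}eg(\bar{p}, \bar{j})$. Applying rule A4 to (**), we obtain
$\vdash_K \mathcal{N}eg(\bar{p}, \bar{j}) \Rightarrow (\exists x_4)(x_4 \le \bar{k} \land \mathcal{P}f(x_4, \bar{j}))$. Hence, by MP,
$\vdash_K (\exists x_4)(x_4 \le \bar{k} \land \mathcal{P}f(x_4, \bar{j}))$, which is an abbreviation for


(#) $\vdash_K \lnot(\forall x_4)\lnot(x_4 \le \bar{k} \land \mathcal{P}f(x_4, \bar{j}))$


Since $\vdash_K \mathcal{R}$, the consistency of K implies not-$\vdash_K \lnot\mathcal{R}$. Hence, Pf(n, j) is
false for all natural numbers n. Therefore, $\vdash_K \lnot\mathcal{P}f(\bar{n}, \bar{j})$ for all natural
numbers n. Since K is a theory with equality, $\vdash_K x_4 = \bar{n} \Rightarrow \lnot\mathcal{P}f(x_4, \bar{j})$
for all natural numbers n. By condition 4,


(†) $\vdash_K x_4 \le \bar{k} \Rightarrow x_4 = 0 \lor x_4 = \bar{1} \lor \ldots \lor x_4 = \bar{k}$

But

(‡) $\vdash_K x_4 = \bar{n} \Rightarrow \lnot\mathcal{P}f(x_4, \bar{j})$ for n = 0, 1, ..., k


So, by a suitable tautology, (†) and (‡) yield $\vdash_K x_4 \le \bar{k} \Rightarrow \lnot\mathcal{P}f(x_4, \bar{j})$
and then, by another tautology, $\vdash_K \lnot(x_4 \le \bar{k} \land \mathcal{P}f(x_4, \bar{j}))$. By Gen,
$\vdash_K (\forall x_4)\lnot(x_4 \le \bar{k} \land \mathcal{P}f(x_4, \bar{j}))$. This, together with (#), contradicts
the consistency of K.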


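b. Assume $\vdash_K \lnot\mathcal{R}$. Let m be the Gödel number of a proof of $\lnot\mathcal{R}$ in K. So,
Pf(m, j) is true and, therefore, $\vdash_K \mathcal{P}f(\bar{m}, \bar{j})$. Hence, by an application of rule
E4 and the deduction theorem, $\vdash_K \bar{m} \le x_2 \Rightarrow (\exists x_4)(x_4 \le x_2 \land \mathcal{P}f(x_4, \bar{j}))$.
By consistency of K, not-$\vdash_K \mathcal{R}$ and, therefore, Pf(n, p) is false for all
natural numbers n. Hence, $\vdash_K \lnot\mathcal{P}f(\bar{n}, \bar{p})$ for all natural numbers
n. By condition 4, $\vdash_K x_2 \le \bar{m} \Rightarrow x_2 = 0 \lor x_2 = \bar{1} \lor \ldots \lor x_2 = \bar{m}$. Hence,
$\vdash_K x_2 \le \bar{m} \Rightarrow \lnot\mathcal{P}f(x_2, \bar{p})$. Consider the following derivation.


1. $\mathcal{P}f(x_2, \bar{p})$    Hyp
2. $\mathcal{N}eg(\bar{p}, x_3)$    Hyp
3. $x_2 \le \bar{m} \lor \bar{m} \le x_2$    Condition 5
4. $\bar{m} \le x_2 \Rightarrow (\exists x_4)(x_4 \le x_2 \land \mathcal{P}f(x_4, \bar{j}))$    Proved above
5. $x_2 \le \bar{m} \Rightarrow \lnot\mathcal{P}f(x_2, \bar{p})$    Proved above
6. $\lnot\mathcal{P}f(x_2, \bar{p}) \lor (\exists x_4)(x_4 \le x_2 \land \mathcal{P}f(x_4, \bar{j}))$    3-5, tautology
7. $(\exists x_4)(x_4 \le x_2 \land \mathcal{P}f(x_4, \bar{j}))$    1, 6, disjunction rule
8. $\mathcal{N}eg(\bar{p}, \bar{j})$    Proved in part (a)
9. $(\exists_1 x_3)\mathcal{N}eg(\bar{p}, x_3)$    $\mathcal{N}eg$ represents Neg
10. $x_3 = \bar{j}$    2, 8, 9, properties of =
11. $(\exists x_4)(x_4 \le x_2 \land \mathcal{P}f(x_4, x_3))$    7, 10, substitutivity of =
12. $\mathcal{P}f(x_2, \bar{p}), \mathcal{N}eg(\bar{p}, x_3) \vdash_K (\exists x_4)(x_4 \le x_2 \land \mathcal{P}f(x_4, x_3))$    1-11
13. $\mathcal{P}f(x_2, \bar{p}) \vdash_K \mathcal{N}eg(\bar{p}, x_3) \Rightarrow (\exists x_4)(x_4 \le x_2 \land \mathcal{P}f(x_4, x_3))$    1-12, Corollary 2.6
14. $\mathcal{P}f(x_2, \bar{p}) \vdash_K (\forall x_3)(\mathcal{N}eg(\bar{p}, x_3) \Rightarrow (\exists x_4)(x_4 \le x_2 \land \mathcal{P}f(x_4, x_3)))$    13, Gen
15. $\vdash_K \mathcal{P}f(x_2, \bar{p}) \Rightarrow (\forall x_3)(\mathcal{N}eg(\bar{p}, x_3) \Rightarrow (\exists x_4)(x_4 \le x_2 \land \mathcal{P}f(x_4, x_3)))$    1-14, Corollary 2.6
16. $\vdash_K (\forall x_2)(\mathcal{P}f(x_2, \bar{p}) \Rightarrow (\forall x_3)(\mathcal{N}eg(\bar{p}, x_3) \Rightarrow (\exists x_4)(x_4 \le x_2 \land \mathcal{P}f(x_4, x_3))))$    15, Gen
17. $\vdash_K \mathcal{R}$    (*), 16, biconditional elimination


Thus, $\vdash_K \mathcal{R}$ and $\vdash_K \lnot\mathcal{R}$, contradicting the consistency of K.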


The Gödel and Rosser sentences for the theory S are undecidable sentences
of S. They have a certain intuitive metamathematical meaning; for example,
a Gödel sentence $\mathcal{G}$ asserts that $\mathcal{G}$ is unprovable in S. Until recently, no
undecidable sentences of S were known that had intrinsic mathematical 
interest. However, in 1977, a mathematically significant sentence of combi- 
natorics, related to the so-called finite Ramsey theorem, was shown to be 
undecidable in S (see Kirby and Paris, 1977; Paris and Harrington, 1977; and 
Paris, 1978). 


Definition 


A theory K is said to be recursively axiomatizable if there is a theory K* having 
the same theorems as K such that K* has a recursive axiom set. 


Corollary 3.39 


Let K be a theory in the language $\mathcal{L}_A$. If K is a consistent, recursively
axiomatizable extension of RR, then K has an undecidable sentence.


Proof 


Let K* be a theory having the same theorems as K and such that K* has a recursive
axiom set. Conditions 1-5 of Proposition 3.38 hold for K*. Hence, a Rosser
sentence for K* is undecidable in K* and, therefore, also undecidable in K. 

An effectively decidable set of objects is a set for which there is a mechanical 
procedure that determines, for any given object, whether or not that object 
belongs to the set. By a mechanical procedure we mean a procedure that is
carried out automatically without any need for originality or ingenuity in
its application. On the other hand, a set A of natural numbers is said to be
recursive if the property $x \in A$ is recursive.* The reader should be convinced
after Chapter 5 that the precise notion of recursive set corresponds to the intuitive 
idea of an effectively decidable set of natural numbers. This hypothesis is known 
as Church's thesis. 
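
As a small illustration of the link between a mechanical membership test and a characteristic function, here is a minimal Python sketch. It assumes the convention that $C_A(x) = 0$ when $x \in A$ and $C_A(x) = 1$ otherwise; the helper names `characteristic` and `is_prime` are introduced here for illustration only.

```python
def characteristic(member):
    """Return C_A for the set A decided by `member`, using the convention
    C_A(x) = 0 if x is in A and C_A(x) = 1 otherwise."""
    return lambda x: 0 if member(x) else 1


def is_prime(n):
    """A mechanical membership test for the set of primes (trial division)."""
    return n >= 2 and all(n % d for d in range(2, int(n ** 0.5) + 1))


c_primes = characteristic(is_prime)
print([c_primes(n) for n in range(8)])
```

Any procedure of this kind decides membership by a finite computation that needs no ingenuity, which is the informal notion that Church's thesis identifies with recursiveness.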

Remember that a theory is said to be axiomatic if the set of its axioms 
is effectively decidable. Clearly, the set of axioms is effectively decidable 
if and only if the set of Gödel numbers of axioms is effectively decidable
(since we can pass effectively from a wf to its Gödel number and, conversely,
from the Gödel number to the wf). Hence, if we accept Church’s
thesis, to say that K has a recursive axiom set is equivalent to saying that K 
is an axiomatic theory, and, therefore, Corollary 3.39 shows RR is essentially 
incomplete, that is, that every consistent axiomatic extension of RR has an 
undecidable sentence. This result is very disturbing; it tells us that there 
is no complete axiomatization of arithmetic, that is, there is no way to set 
up an axiom system on the basis of which we can decide all problems of 
number theory. 


Exercises 


3.46 Church’s thesis is usually taken in the form that a number-theoretic func- 
tion is effectively computable if and only if it is recursive. Prove that this is 
equivalent to the form of Church's thesis given above. 


3.47 Let K be a true theory that satisfies the hypotheses of the Gödel-Rosser
theorem. Determine whether a Rosser sentence $\mathcal{R}$ for K is true for the
standard interpretation. 


3.48 (Church, 1936b) Let Tr be the set of Gödel numbers of all wfs in the language
$\mathcal{L}_A$ that are true for the standard interpretation. Prove that Tr is
not recursive. (Hence, under the assumption of Church’s thesis, there is 
no effective procedure for determining the truth or falsity of arbitrary 
sentences of arithmetic.) 

3.49 Prove that there is no recursively axiomatizable theory that has Tr as 
the set of Gödel numbers of its theorems.

3.50 Let K be a theory with equality in the language $\mathcal{L}_A$ that satisfies conditions
4 and 5 on page 210. If every recursive relation is expressible in K,
prove that every recursive function is representable in K. 


* To say that $x \in A$ is recursive means that the characteristic function $C_A$ is a recursive function,
where $C_A(x) = 0$ if $x \in A$ and $C_A(x) = 1$ if $x \notin A$ (see page 180).
Gödel’s Second Theorem


Let K be an extension of S in the language $\mathcal{L}_A$ such that K has a recursive
axiom set. Let $\mathcal{C}on_K$ be the following closed wf of K:


$(\forall x_1)(\forall x_2)(\forall x_3)(\forall x_4)\lnot(\mathcal{P}f(x_1, x_3) \land \mathcal{P}f(x_2, x_4) \land \mathcal{N}eg(x_3, x_4))$


For the standard interpretation, $\mathcal{C}on_K$ asserts that there are no proofs in K of a
wf and its negation, that is, that K is consistent.
Consider the following sentence:


(G) $\mathcal{C}on_K \Rightarrow \mathcal{G}$


where $\mathcal{G}$ is a Gödel sentence for K. Remember that $\mathcal{G}$ asserts that $\mathcal{G}$ is unprovable
in K. Hence, (G) states that, if K is consistent, then $\mathcal{G}$ is not provable in
K. But that is just the first half of Gödel’s incompleteness theorem. The
metamathematical reasoning used in the proof of that theorem can be expressed
and carried through within K itself, so that one obtains a proof in K of (G)
(see Hilbert and Bernays, 1939, pp. 285-328; Feferman, 1960). Thus, $\vdash_K \mathcal{C}on_K \Rightarrow \mathcal{G}$.
But, by Gödel’s incompleteness theorem, if K is consistent, $\mathcal{G}$ is not provable
in K. Hence, if K is consistent, $\mathcal{C}on_K$ is not provable in K.

This is Gödel’s second theorem (1931). One can paraphrase it by stating that,
if K is consistent, then the consistency of K cannot be proved within K, or,
equivalently, a consistency proof of K must use ideas and methods that go
beyond those available in K. Consistency proofs for S have been given by
Gentzen (1936, 1938) and Schütte (1951), and these proofs do, in fact, employ
notions and methods (for example, a portion of the theory of denumerable
ordinal numbers) that apparently are not formalizable in S.

Gödel’s second theorem is sometimes stated in the form that, if a “sufficiently
strong” theory K is consistent, then the consistency of K cannot be proved
within K. Aside from the vagueness of the phrase “sufficiently strong” (which can be
made precise without much difficulty), the way in which the consistency of
K is formulated is crucial. Feferman (1960, Cor. 5.10) has shown that there is a
way of formalizing the consistency of S (say, $\mathcal{C}on_S^*$) such that $\vdash_S \mathcal{C}on_S^*$. A precise
formulation of Gödel’s second theorem may be found in Feferman (1960).
(See Jeroslow (1971, 1972, 1973) for further clarification and development.)

In their proof of Gödel’s second theorem, Hilbert and Bernays (1939) based
their work on three so-called derivability conditions. For the sake of definiteness,
we shall limit ourselves to the theory S, although everything we say also
holds for recursively axiomatizable extensions of S. To formulate the
Hilbert-Bernays results, let $\mathcal{B}ew(x_1)$ stand for $(\exists x_2)\mathcal{P}f(x_2, x_1)$. Thus, under the standard
interpretation, $\mathcal{B}ew(x_1)$ means that there is a proof in S of the wf with Gödel
number $x_1$; that is, the wf with Gödel number $x_1$ is provable in S.* Notice that a
Gödel sentence $\mathcal{G}$ for S satisfies the fixed-point condition: $\vdash_S \mathcal{G} \Leftrightarrow \lnot\mathcal{B}ew(\ulcorner\mathcal{G}\urcorner)$.


* “Bew” consists of the first three letters of the German word beweisbar, which means “provable.”
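The Hilbert-Bernays Derivability Conditions*


(HB1) If $\vdash_S \mathcal{C}$, then $\vdash_S \mathcal{B}ew(\ulcorner\mathcal{C}\urcorner)$


(HB2) $\vdash_S \mathcal{B}ew(\ulcorner\mathcal{C} \Rightarrow \mathcal{D}\urcorner) \Rightarrow (\mathcal{B}ew(\ulcorner\mathcal{C}\urcorner) \Rightarrow \mathcal{B}ew(\ulcorner\mathcal{D}\urcorner))$


(HB3) $\vdash_S \mathcal{B}ew(\ulcorner\mathcal{C}\urcorner) \Rightarrow \mathcal{B}ew(\ulcorner\mathcal{B}ew(\ulcorner\mathcal{C}\urcorner)\urcorner)$


Here, $\mathcal{C}$ and $\mathcal{D}$ are arbitrary closed wfs of S. (HB1) is straightforward and
(HB2) is an easy consequence of properties of $\mathcal{P}f$. However, (HB3) requires
a careful and difficult proof. (A clear treatment may also be found in Boolos
(1993, Chapter 2), and in Shoenfield (1967, pp. 211-213).)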

A Gödel sentence $\mathcal{G}$ for S asserts its own unprovability in S: $\vdash_S \mathcal{G} \Leftrightarrow \lnot\mathcal{B}ew(\ulcorner\mathcal{G}\urcorner)$.
We also can apply the fixed-point theorem to obtain a sentence $\mathcal{H}$ such that
$\vdash_S \mathcal{H} \Leftrightarrow \mathcal{B}ew(\ulcorner\mathcal{H}\urcorner)$. $\mathcal{H}$ is called a Henkin sentence for S. $\mathcal{H}$ asserts its own provability
in S. On intuitive grounds, it is not clear whether $\mathcal{H}$ is true for the standard
interpretation, nor is it easy to determine whether $\mathcal{H}$ is provable, disprovable
or undecidable in S. The problem was solved by Löb (1955) on the basis of
Proposition 3.40 below. First, however, let us introduce the following convenient
abbreviation.


Notation 


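Let $\Box\mathcal{D}$ stand for $\mathcal{B}ew(\ulcorner\mathcal{D}\urcorner)$, where $\mathcal{D}$ is any wf. Then the Hilbert-Bernays
derivability conditions become


(HB1) If $\vdash_S \mathcal{C}$, then $\vdash_S \Box\mathcal{C}$


(HB2) $\vdash_S \Box(\mathcal{C} \Rightarrow \mathcal{D}) \Rightarrow (\Box\mathcal{C} \Rightarrow \Box\mathcal{D})$


(HB3) $\vdash_S \Box\mathcal{C} \Rightarrow \Box\Box\mathcal{C}$


The Gödel sentence $\mathcal{G}$ and the Henkin sentence $\mathcal{H}$ satisfy the equivalences
$\vdash_S \mathcal{G} \Leftrightarrow \lnot\Box\mathcal{G}$ and $\vdash_S \mathcal{H} \Leftrightarrow \Box\mathcal{H}$.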


Proposition 3.40 (Löb’s Theorem)


Let $\mathcal{C}$ be a sentence of S. If $\vdash_S \Box\mathcal{C} \Rightarrow \mathcal{C}$, then $\vdash_S \mathcal{C}$.


* These three conditions are simplifications by Löb (1955) of the original Hilbert-Bernays
conditions.
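Proof


Apply the fixed-point theorem to the wf $\mathcal{B}ew(x_1) \Rightarrow \mathcal{C}$ to obtain a sentence $\mathcal{L}$
such that $\vdash_S \mathcal{L} \Leftrightarrow (\mathcal{B}ew(\ulcorner\mathcal{L}\urcorner) \Rightarrow \mathcal{C})$. Thus, $\vdash_S \mathcal{L} \Leftrightarrow (\Box\mathcal{L} \Rightarrow \mathcal{C})$. Then we have the
following derivation of $\mathcal{C}$.


1. $\vdash_S \mathcal{L} \Leftrightarrow (\Box\mathcal{L} \Rightarrow \mathcal{C})$    Obtained above
2. $\vdash_S \mathcal{L} \Rightarrow (\Box\mathcal{L} \Rightarrow \mathcal{C})$    1, biconditional elimination
3. $\vdash_S \Box(\mathcal{L} \Rightarrow (\Box\mathcal{L} \Rightarrow \mathcal{C}))$    2, (HB1)
4. $\vdash_S \Box\mathcal{L} \Rightarrow \Box(\Box\mathcal{L} \Rightarrow \mathcal{C})$    3, (HB2), MP
5. $\vdash_S \Box(\Box\mathcal{L} \Rightarrow \mathcal{C}) \Rightarrow (\Box\Box\mathcal{L} \Rightarrow \Box\mathcal{C})$    (HB2)
6. $\vdash_S \Box\mathcal{L} \Rightarrow (\Box\Box\mathcal{L} \Rightarrow \Box\mathcal{C})$    4, 5, tautology
7. $\vdash_S \Box\mathcal{L} \Rightarrow \Box\Box\mathcal{L}$    (HB3)
8. $\vdash_S \Box\mathcal{L} \Rightarrow \Box\mathcal{C}$    6, 7, tautology
9. $\vdash_S \Box\mathcal{C} \Rightarrow \mathcal{C}$    Hypothesis of the theorem
10. $\vdash_S \Box\mathcal{L} \Rightarrow \mathcal{C}$    8, 9, tautology
11. $\vdash_S \mathcal{L}$    1, 10, biconditional elimination
12. $\vdash_S \Box\mathcal{L}$    11, (HB1)
13. $\vdash_S \mathcal{C}$    10, 12, MP
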
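Corollary 3.41


Let $\mathcal{H}$ be a Henkin sentence for S. Then $\vdash_S \mathcal{H}$ and $\mathcal{H}$ is true for the standard
interpretation.


Proof


$\vdash_S \mathcal{H} \Leftrightarrow \Box\mathcal{H}$. By biconditional elimination, $\vdash_S \Box\mathcal{H} \Rightarrow \mathcal{H}$. So, by Löb’s theorem,
$\vdash_S \mathcal{H}$. Since $\mathcal{H}$ asserts that $\mathcal{H}$ is provable in S, $\mathcal{H}$ is true.

Löb’s theorem also enables us to give a proof of Gödel’s second theorem
for S.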


Proposition 3.42 (Gödel’s Second Theorem)


If S is consistent, then not-$\vdash_S \mathcal{C}on_S$.


Proof


Assume S consistent. Since $\vdash_S 0 \ne \bar{1}$, the consistency of S implies not-$\vdash_S 0 = \bar{1}$. By
Löb’s theorem, not-$\vdash_S \Box(0 = \bar{1}) \Rightarrow 0 = \bar{1}$. Hence, by the tautology $\lnot A \Rightarrow (A \Rightarrow B)$,
we have:


(*) not-$\vdash_S \lnot\Box(0 = \bar{1})$

But, since $\vdash_S 0 \ne \bar{1}$, (HB1) yields $\vdash_S \Box(0 \ne \bar{1})$. Then it is easy to show that
$\vdash_S \mathcal{C}on_S \Rightarrow \lnot\Box(0 = \bar{1})$. So, by (*), not-$\vdash_S \mathcal{C}on_S$.

Boolos (1993) gives an elegant and extensive study of the fixed-point theorem
and Löb’s theorem in the context of an axiomatic treatment of provability
predicates. Such an axiomatic approach was first proposed and developed
by Magari (1975).


Exercises 


3.51 Prove (HB1) and (HB2). 
3.52 Give the details of the proof of $\vdash_S \mathcal{C}on_S \Rightarrow \lnot\mathcal{B}ew(\ulcorner 0 = \bar{1}\urcorner)$, which was used
in the proof of Proposition 3.42.


3.53 If $\mathcal{G}$ is a Gödel sentence of S, prove $\vdash_S \mathcal{G} \Leftrightarrow \lnot\mathcal{B}ew(\ulcorner 0 = \bar{1}\urcorner)$. (Hence, any
two Gödel sentences for S are provably equivalent. This is an instance
of a more general phenomenon of equivalence of fixed-point sentences,
first noticed and verified independently by Bernardi (1975, 1976),
De Jongh and Sambin (1976). See Smoryński (1979, 1982).)


3.54 In each of the following cases, apply the fixed-point theorem for S 
to obtain a sentence of the indicated kind; determine whether that 
sentence is provable in S, disprovable in S, or undecidable in S; 
and determine the truth or falsity of the sentence for the standard 
interpretation. 


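a. A sentence $\mathcal{C}$ that asserts its own decidability in S (that is, that $\vdash_S \mathcal{C}$
or $\vdash_S \lnot\mathcal{C}$).

b. A sentence that asserts its own undecidability in S.

c. A sentence $\mathcal{C}$ asserting that not-$\vdash_S \lnot\mathcal{C}$.

d. A sentence $\mathcal{C}$ asserting that $\vdash_S \lnot\mathcal{C}$.
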


3.6 Recursive Undecidability: Church’s Theorem 
If K is a theory, let $T_K$ be the set of Gödel numbers of theorems of K.


Definitions


K is said to be recursively decidable if $T_K$ is a recursive set (that is, the property
$x \in T_K$ is recursive). K is said to be recursively undecidable if $T_K$ is not recursive.
K is said to be essentially recursively undecidable if K and all consistent extensions
of K are recursively undecidable.

If we accept Church’s thesis, then recursive undecidability is equivalent to 
effective undecidability, that is, nonexistence of a mechanical decision pro- 
cedure for theoremhood. The nonexistence of such a mechanical procedure 
means that ingenuity is required for determining whether arbitrary wfs are 
theorems. 


Exercise 


3.55 Prove that an inconsistent theory having a recursive vocabulary is 
recursively decidable. 


Proposition 3.43 


Let K be a consistent theory with equality in the language $\mathcal{L}_A$ in which
the diagonal function D is representable. Then the property $x \in T_K$ is not
expressible in K.


Proof 


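Assume $x \in T_K$ is expressible in K by a wf $\mathcal{T}(x_1)$. Thus


a. If $n \in T_K$, $\vdash_K \mathcal{T}(\bar{n})$.

b. If $n \notin T_K$, $\vdash_K \lnot\mathcal{T}(\bar{n})$.

By the diagonalization lemma applied to $\lnot\mathcal{T}(x_1)$, there is a sentence $\mathcal{C}$
such that $\vdash_K \mathcal{C} \Leftrightarrow \lnot\mathcal{T}(\ulcorner\mathcal{C}\urcorner)$. Let q be the Gödel number of $\mathcal{C}$. So

c. $\vdash_K \mathcal{C} \Leftrightarrow \lnot\mathcal{T}(\bar{q})$.


Case 1: $\vdash_K \mathcal{C}$. Then $q \in T_K$. By (a), $\vdash_K \mathcal{T}(\bar{q})$. But, from $\vdash_K \mathcal{C}$ and (c), by
biconditional elimination, $\vdash_K \lnot\mathcal{T}(\bar{q})$. Hence K is inconsistent, contradicting our
hypothesis.


Case 2: not-$\vdash_K \mathcal{C}$. So, $q \notin T_K$. By (b), $\vdash_K \lnot\mathcal{T}(\bar{q})$. Hence, by (c) and biconditional
elimination, $\vdash_K \mathcal{C}$, contradicting the assumption of this case.
Thus, in either case a contradiction is reached.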


Definition 


A set B of natural numbers is said to be arithmetical if there is a wf $\mathcal{B}(x)$ in the
language $\mathcal{L}_A$, with one free variable x, such that, for every natural number n,
$n \in B$ if and only if $\mathcal{B}(\bar{n})$ is true for the standard interpretation.
Corollary 3.44 [Tarski’s Theorem (1936)]


Let Tr be the set of Gödel numbers of wfs of S that are true for the standard
interpretation. Then Tr is not arithmetical.


Proof 


Let $\mathcal{N}$ be the extension of S that has as proper axioms all those wfs that
are true for the standard interpretation. Since every theorem of $\mathcal{N}$ must be
true for the standard interpretation, the theorems of $\mathcal{N}$ are identical with
the axioms of $\mathcal{N}$. Hence, $T_{\mathcal{N}} = \mathrm{Tr}$. Thus, for any closed wf $\mathcal{B}$, $\mathcal{B}$ holds for the
standard interpretation if and only if $\vdash_{\mathcal{N}} \mathcal{B}$. It follows that a set B is arithmetical
if and only if the property $x \in B$ is expressible in $\mathcal{N}$. We may assume that
$\mathcal{N}$ is consistent because it has the standard interpretation as a model. Since
every recursive function is representable in S, every recursive function is
representable in $\mathcal{N}$ and, therefore, D is representable in $\mathcal{N}$. By Proposition
3.43, $x \in \mathrm{Tr}$ is not expressible in $\mathcal{N}$. Hence, Tr is not arithmetical. (This result
can be roughly paraphrased by saying that the notion of arithmetical truth is
not arithmetically definable.)


Proposition 3.45 


Let K be a consistent theory with equality in the language $\mathcal{L}_A$ in which all
recursive functions are representable. Assume also that $\vdash_K 0 \ne \bar{1}$. Then K is
recursively undecidable.


Proof 


D is primitive recursive and, therefore, representable in K. By Proposition
3.43, the property $x \in T_K$ is not expressible in K. By Proposition 3.13, the
characteristic function $C_{T_K}$ is not representable in K. Hence, $C_{T_K}$ is not a recursive
function. Therefore, $T_K$ is not a recursive set and so, by definition, K is
recursively undecidable.


Corollary 3.46 


RR is essentially recursively undecidable. 


Proof 


RR and all consistent extensions of RR satisfy the conditions on K in 
Proposition 3.45 and, therefore, are recursively undecidable. (We take for 
granted that RR is consistent because it has the standard interpretation as 
a model. More constructive consistency proofs can be given along the same
lines as the proofs by Beth (1959, § 84) or Kleene (1952, § 79).) 

We shall now show how this result can be used to give another derivation 
of the Gödel-Rosser theorem.


Proposition 3.47 


Let K be a theory with a recursive vocabulary. If K is recursively axiomatiz- 
able and recursively undecidable, then K is incomplete (i.e., K has an unde- 
cidable sentence). 


Proof 


By the recursive axiomatizability of K, there is a theory J with a recursive 
axiom set that has the same theorems as K. Since K and J have the same theorems,
$T_K = T_J$ and, therefore, J is recursively undecidable, and K is incomplete
if and only if J is incomplete. So, it suffices to prove J incomplete. Notice that, 
since K and J have the same theorems, J and K must have the same individual 
constants, function letters, and predicate letters (because all such symbols 
occur in logical axioms). Thus, the hypotheses of Propositions 3.26 and 3.28 
hold for J. Moreover, J is consistent, since an inconsistent theory with a recur- 
sive vocabulary is recursively decidable. 

Assume J is complete. Remember that, if x is the Gödel number of a wf,
Clos(x) is the Gödel number of the closure of that wf. By Proposition 3.26 (16),
Clos is a recursive function. Define:


$H(x) = \mu y[(\mathrm{Fml}(x) \land (\mathrm{Pf}(y, \mathrm{Clos}(x)) \lor \mathrm{Pf}(y, \mathrm{Neg}(\mathrm{Clos}(x))))) \lor \lnot\mathrm{Fml}(x)]$


Notice that, if x is not the Gödel number of a wf, H(x) = 0. If x is the Gödel
number of a wf $\mathcal{B}$, the closure of $\mathcal{B}$ is a closed wf and, by the completeness of
J, there is a proof in J of either the closure of $\mathcal{B}$ or its negation. Hence, H(x) is
obtained by a legitimate application of the restricted $\mu$-operator and, therefore,
H is a recursive function. Recall that a wf is provable if and only if its closure
is provable. So, $x \in T_J$ if and only if Pf(H(x), Clos(x)). But Pf(H(x), Clos(x)) is
recursive. Thus, $T_J$ is recursive, contradicting the recursive undecidability of J.

The intuitive idea behind this proof is the following. Given any wf $\mathcal{B}$, we
form its closure $\mathcal{C}$ and start listing all the theorems in J. (Since PrAx is recursive,
Church's Thesis tells us that J is an axiomatic theory and, therefore, by
the argument on page 84, we have an effective procedure for generating all
the theorems.) If J is complete, either $\mathcal{C}$ or $\lnot\mathcal{C}$ will eventually appear in the
list of theorems. If $\mathcal{C}$ appears, $\mathcal{B}$ is a theorem. If $\lnot\mathcal{C}$ appears, then, by the
consistency of J, $\mathcal{C}$ will not appear among the theorems and, therefore, $\mathcal{B}$ is not
a theorem. Thus, we have a decision procedure for theoremhood and, again 
by Church’s thesis, J would be recursively decidable. 
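
The search just described can be sketched in a few lines of Python. This is a toy illustration, not the construction used in the proof: the generator `enumerate_theorems` is a hypothetical effective listing of the theorems of a small complete and consistent "theory" whose only sentences are the strings `even(n)` and `~even(n)`, and all names are introduced here for illustration.

```python
from itertools import count


def enumerate_theorems():
    """Hypothetical effective listing of the theorems of a toy complete theory J.

    Sentences are strings "even(n)" and their negations "~even(n)".
    """
    for n in count():
        yield f"even({n})" if n % 2 == 0 else f"~even({n})"


def negate(sentence):
    """Negation on the toy syntax, collapsing a double negation."""
    return sentence[1:] if sentence.startswith("~") else "~" + sentence


def decide(sentence):
    """Scan the listing until the sentence or its negation appears.

    The scan terminates because J is assumed complete; the answer is
    correct because J is assumed consistent.
    """
    for theorem in enumerate_theorems():
        if theorem == sentence:
            return True
        if theorem == negate(sentence):
            return False


print(decide("even(4)"), decide("even(7)"))
```

The loop is guaranteed to stop only for sentences of the toy language; for an incomplete theory the same search could run forever on an undecidable sentence, which is why completeness is essential to the argument.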
Corollary 3.48 (Gödel-Rosser Theorem)


Any consistent recursively axiomatizable extension of RR has undecidable 
sentences. 


Proof 


This is an immediate consequence of Corollary 3.46 and Proposition 3.47. 


Exercises 


3.56 Prove that a recursively decidable theory must be recursively axiomatizable. 


3.57 Let K be any recursively axiomatizable true theory with equality. 
(So, $T_K \subseteq \mathrm{Tr}$.) Prove that K has an undecidable sentence. [Hint: Use
Proposition 3.47 and Exercise 3.48.]


3.58 Two sets A and B of natural numbers are said to be recursively inseparable
if there is no recursive set C such that $A \subseteq C$ and $B \subseteq \bar{C}$. ($\bar{C}$ is the
complement $\omega - C$.) Let K be any consistent theory with equality in the language
$\mathcal{L}_A$ in which all recursive functions are representable and such that
$\vdash_K 0 \ne \bar{1}$. Let $\mathrm{Ref}_K$ be the set of Gödel numbers of refutable wfs of K, that
is, $\{x \mid \mathrm{Neg}(x) \in T_K\}$. Prove that $T_K$ and $\mathrm{Ref}_K$ are recursively inseparable.


Definitions 
Let $K_1$ and $K_2$ be two theories in the same language.


a. $K_2$ is called a finite extension of $K_1$ if and only if there is a set A of wfs
and a finite set B of wfs such that (1) the theorems of $K_1$ are precisely
the wfs derivable from A; and (2) the theorems of $K_2$ are precisely the
wfs derivable from $A \cup B$.

b. Let $K_1 \cup K_2$ denote the theory whose set of axioms is the union of the
set of axioms of $K_1$ and the set of axioms of $K_2$. We say that $K_1$ and $K_2$
are compatible if $K_1 \cup K_2$ is consistent.


Proposition 3.49 


Let $K_1$ and $K_2$ be two theories in the same language. If $K_2$ is a finite extension
of $K_1$ and if $K_2$ is recursively undecidable, then $K_1$ is recursively undecidable.


Proof


Let A be a set of axioms of $K_1$ and $A \cup \{\mathcal{B}_1, \ldots, \mathcal{B}_n\}$ a set of axioms for $K_2$. We
may assume that $\mathcal{B}_1, \ldots, \mathcal{B}_n$ are closed wfs. Then, by Corollary 2.7, it is easy to
see that a wf $\mathcal{C}$ is provable in $K_2$ if and only if $(\mathcal{B}_1 \land \ldots \land \mathcal{B}_n) \Rightarrow \mathcal{C}$ is provable
in $K_1$. Let c be a Gödel number of $(\mathcal{B}_1 \land \ldots \land \mathcal{B}_n)$. Then b is a Gödel number of
a theorem of $K_2$ when and only when $2^3 * c * 2^{11} * b * 2^5$ is a Gödel number of a
theorem of $K_1$; that is, b is in $T_{K_2}$ if and only if $2^3 * c * 2^{11} * b * 2^5$ is in $T_{K_1}$. Hence,
if $T_{K_1}$ were recursive, $T_{K_2}$ would also be recursive, contradicting the recursive
undecidability of $K_2$.
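
The reduction in this proof can be sketched as follows. This is a minimal illustration under loud assumptions: $K_1$ is replaced by the propositional calculus, whose theoremhood is decided by truth tables in the stand-in `decide_k1`, and formulas are nested tuples rather than Gödel numbers. All names are introduced here for illustration.

```python
from itertools import product

# Formulas are nested tuples: ("var", "p"), ("not", A), ("and", A, B), ("imp", A, B).


def variables(f):
    """Collect the statement letters occurring in f."""
    if f[0] == "var":
        return {f[1]}
    return set().union(*(variables(g) for g in f[1:]))


def evaluate(f, env):
    """Truth value of f under the assignment env."""
    op = f[0]
    if op == "var":
        return env[f[1]]
    if op == "not":
        return not evaluate(f[1], env)
    if op == "and":
        return evaluate(f[1], env) and evaluate(f[2], env)
    if op == "imp":
        return (not evaluate(f[1], env)) or evaluate(f[2], env)
    raise ValueError(op)


def decide_k1(f):
    """Stand-in decider for K1 (assumed here to be the propositional calculus)."""
    vs = sorted(variables(f))
    return all(evaluate(f, dict(zip(vs, vals)))
               for vals in product([False, True], repeat=len(vs)))


def make_k2_decider(decide_k1, extra_axioms):
    """C is a theorem of K2 iff (B1 and ... and Bn) => C is a theorem of K1."""
    conj = extra_axioms[0]
    for b in extra_axioms[1:]:
        conj = ("and", conj, b)
    return lambda c: decide_k1(("imp", conj, c))


p, q = ("var", "p"), ("var", "q")
decide_k2 = make_k2_decider(decide_k1, [p, ("imp", p, q)])
print(decide_k2(q), decide_k1(q))
```

Read contrapositively, this is the content of the proposition: any decision procedure for $K_1$ would yield one for $K_2$, so if $K_2$ has none, neither does $K_1$.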


Proposition 3.50 


Let K be a theory in the language $\mathcal{L}_A$. If K is compatible with RR, then K is
recursively undecidable. 


Proof 


Since K is compatible with RR, the theory K U RR is a consistent extension
of RR. Therefore, by Corollary 3.46, K U RR is recursively undecidable. Since 
RR has a finite number of axioms, K U RR is a finite extension of K. Hence, by 
Proposition 3.49, K is recursively undecidable. 


Corollary 3.51 
Every true theory K is recursively undecidable. 


Proof 


K U RR has the standard interpretation as a model and is, therefore, consis- 
tent. Thus, K is compatible with RR. Now apply Proposition 3.50. 


Corollary 3.52 


Let $P_S$ be the predicate calculus in the language $\mathcal{L}_A$. Then $P_S$ is recursively
undecidable.


Proof


$P_S \cup \mathrm{RR} = \mathrm{RR}$. Hence, $P_S$ is compatible with RR and, therefore, by Proposition
3.50, recursively undecidable.

By PF we mean the full first-order predicate calculus containing all predi- 
cate letters, function letters and individual constants. Let PP be the pure 
first-order predicate calculus, containing all predicate letters but no function 
letters or individual constants. 
Lemma 3.53


There is a recursive function h such that, for any wf $\mathcal{B}$ of PF having Gödel
number u, there is a wf $\mathcal{B}'$ of PP having Gödel number h(u) such that $\mathcal{B}$ is
provable in PF if and only if $\mathcal{B}'$ is provable in PP.


Proof 


Let $\Phi$ be a wf of PF. First we will eliminate any individual constants from $\Phi$.
Assume b is an individual constant in $\Phi$. Let $A_m^1$ be the first new symbol of
that form. Intuitively we imagine that $A_m^1$ represents a property that holds
only for b. Let $\Phi^*(z)$ be obtained from $\Phi$ by replacing all occurrences of b by z.
We will associate with $\Phi$ a new wf $\Psi$, where $\Psi$ has the form


$\{((\exists z)A_m^1(z)) \land (\forall x)(\forall y)(x = y \Rightarrow [A_m^1(x) \Leftrightarrow A_m^1(y)])\} \Rightarrow (\forall z)[A_m^1(z) \Rightarrow \Phi^*(z)]$


Then $\Psi$ is logically valid if and only if $\Phi$ is logically valid. We apply the
same procedure to $\Psi$ and so on until we obtain a wf $\Phi^{\$}$ that contains no
individual constants and is logically valid if and only if $\Phi$ is logically
valid. Now we apply to $\Phi^{\$}$ a similar, but somewhat more complicated,
procedure to obtain a wf $\Theta$ that contains no function letters and is logically
valid if and only if $\Phi$ is logically valid. Consider the first function letter $f_j^n$
in $\Phi^{\$}$. Take the first new symbol $A_r^{n+1}$ of that form. Intuitively we imagine
that $A_r^{n+1}$ holds for $(x_1, \ldots, x_{n+1})$ if and only if $f_j^n(x_1, \ldots, x_n) = x_{n+1}$. We wish
to construct a wf that plays a role similar to the role played by $\Psi$ above.
However, the situation is more complex here because there may be iterated
applications of $f_j^n$ in $\Phi^{\$}$. We shall take a relatively easy case where $f_j^n$ has
only simple (noniterated) occurrences, say, $f_j^n(s_1, \ldots, s_n)$ and $f_j^n(t_1, \ldots, t_n)$.
Let $\Phi^{\$*}$ be obtained from $\Phi^{\$}$ by replacing the occurrences of $f_j^n(s_1, \ldots, s_n)$
by v and the occurrences of $f_j^n(t_1, \ldots, t_n)$ by w. In the wf $\Theta$ analogous to
$\Psi$, use as conjuncts in the antecedent $(\forall x_1) \ldots (\forall x_n)(\exists_1 z)A_r^{n+1}(x_1, \ldots, x_n, z)$
and the n+1 equality substitution axioms for $A_r^{n+1}$, and, as the consequent,
$(\forall v)(\forall w)(A_r^{n+1}(s_1, \ldots, s_n, v) \land A_r^{n+1}(t_1, \ldots, t_n, w) \Rightarrow \Phi^{\$*})$. We leave it to the
reader to construct $\Theta$ when there are nonsimple occurrences of $f_j^n$. If u is
the Gödel number of the original wf $\Phi$, let h(u) be the Gödel number of the
result $\Theta$. When u is not the Gödel number of a wf of PF, define h(u) to be 0.
Clearly, h is effectively computable because we have described an effective
procedure for obtaining $\Theta$ from $\Phi$. Therefore, by Church’s thesis, h is
recursive. Alternatively, an extremely diligent reader could avoid the use 
of Church’s thesis by “arithmetizing” all the steps described above in the 
computation of h. 
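
To make the first step of this procedure concrete, here is a minimal Python sketch of the elimination of one individual constant. It is an illustration under assumptions, not the arithmetized function h: wfs are represented as nested tuples, equality is treated as an ordinary atom with letter `"="`, and the names `substitute`, `terms` and `eliminate_constant` are introduced here.

```python
def substitute(f, old, new):
    """Replace every occurrence of the term `old` by `new` in the wf f."""
    if f[0] == "atom":                       # ("atom", letter, (t1, ..., tn))
        return ("atom", f[1], tuple(new if t == old else t for t in f[2]))
    if f[0] in ("all", "exists"):            # ("all", x, A) or ("exists", x, A)
        return (f[0], f[1], substitute(f[2], old, new))
    return (f[0],) + tuple(substitute(g, old, new) for g in f[1:])


def terms(f):
    """All variables and constants occurring in f."""
    if f[0] == "atom":
        return set(f[2])
    if f[0] in ("all", "exists"):
        return {f[1]} | terms(f[2])
    return set().union(*(terms(g) for g in f[1:]))


def eliminate_constant(phi, b, new_letter):
    """Build the wf Psi associated with phi, removing the constant b.

    `new_letter` plays the role of the new monadic predicate letter.
    """
    used = terms(phi)
    # Pick three variable names that do not occur in phi.
    candidates = (f"v{i}" for i in range(len(used) + 3))
    x, y, z = [v for v in candidates if v not in used][:3]
    a = lambda t: ("atom", new_letter, (t,))
    phi_star = substitute(phi, b, z)
    antecedent = ("and",
                  ("exists", z, a(z)),
                  ("all", x, ("all", y,
                              ("imp", ("atom", "=", (x, y)), ("iff", a(x), a(y))))))
    return ("imp", antecedent, ("all", z, ("imp", a(z), phi_star)))


phi = ("all", "x", ("atom", "P", ("x", "b")))
psi = eliminate_constant(phi, "b", "A_new")
print("b" in terms(psi))
```

Each step is a purely syntactic rewriting of this kind, which is the informal reason for regarding h as effectively computable.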
Proposition 3.54 (Church’s Theorem (1936a))


PF and PP are recursively undecidable.


Proof


a. By Gödel’s completeness theorem, a wf $\mathcal{B}$ of $P_S$ is provable in $P_S$ if
and only if $\mathcal{B}$ is logically valid, and $\mathcal{B}$ is provable in PF if and only
if $\mathcal{B}$ is logically valid. Hence, $\vdash_{P_S} \mathcal{B}$ if and only if $\vdash_{PF} \mathcal{B}$. However,
the set $\mathrm{Fml}_{P_S}$ of Gödel numbers of wfs of $P_S$ is recursive. Then
$T_{P_S} = T_{PF} \cap \mathrm{Fml}_{P_S}$, where $T_{P_S}$ and $T_{PF}$ are, respectively, the sets of
Gödel numbers of the theorems of $P_S$ and PF. If $T_{PF}$ were recursive,
$T_{P_S}$ would be recursive, contradicting Corollary 3.52. Therefore, PF is
recursively undecidable.

b. By Lemma 3.53, u is in $T_{PF}$ if and only if h(u) is in $T_{PP}$. Since h is recursive,
the recursiveness of $T_{PP}$ would imply the recursiveness of $T_{PF}$,
contradicting (a). Thus, $T_{PP}$ is not recursive; that is, PP is recursively
undecidable.
If we accept Church's thesis, then “recursively undecidable” can be replaced
everywhere by “effectively undecidable.” In particular, Proposition 3.54
states that there is no decision procedure for recognizing theoremhood,
either for the pure predicate calculus PP or the full predicate calculus PF. By
Gödel’s completeness theorem, this implies that there is no effective method for
determining whether any given wf is logically valid.


Exercises 


3.59 a. By a wf of the pure monadic predicate calculus (PMP) we mean
a wf of the pure predicate calculus that does not contain predicate
letters of more than one argument. Show that, in contrast to
Church’s theorem, there is an effective procedure for determining
whether a wf of PMP is logically valid. [Hint: Let $B_1, B_2, \ldots, B_k$ be the
distinct predicate letters in a wf $\mathcal{B}$. Then $\mathcal{B}$ is logically valid if and
only if $\mathcal{B}$ is true for every interpretation with at most $2^k$ elements.
(In fact, assume $\mathcal{B}$ is true for every interpretation with at most $2^k$
elements, and let M be any interpretation. For any elements b and c
of the domain D of M, call b and c equivalent if the truth values of
$B_1(b), B_2(b), \ldots, B_k(b)$ in M are, respectively, the same as those of $B_1(c)$,
$B_2(c), \ldots, B_k(c)$. This defines an equivalence relation in D, and the
corresponding set of equivalence classes has at most $2^k$ members
and can be made the domain of an interpretation M* by defining
interpretations of $B_1, \ldots, B_k$, in the obvious way, on the equivalence
classes. By induction on the length of wfs $\mathcal{C}$ that contain no predicate
letters other than $B_1, \ldots, B_k$, one can show that $\mathcal{C}$ is true for M if and
only if it is true for M*. Since $\mathcal{B}$ is true for M*, it is also true for M.
Hence, $\mathcal{B}$ is true for every interpretation.) Note also that whether $\mathcal{B}$
is true for every interpretation that has at most $2^k$ elements can be
effectively determined.]*


b. Prove that a wf $\mathcal{B}$ of PMP is logically valid if and only if $\mathcal{B}$ is true for
all finite interpretations. (This contrasts with the situation in the
pure predicate calculus; see Exercise 2.56 on page 92.)


3.60 If a theory K* is consistent, if every theorem of an essentially recursively
undecidable theory $K_1$ is a theorem of K*, and if the property $\mathrm{Fml}_{K_1}(y)$
is recursive, prove that K* is essentially recursively undecidable.


3.61 (Tarski et al., 1953, I)
a. Let K be a theory with equality. If a predicate letter $A_j^n$, a function


letter $f_j^n$ and an individual constant $a_j$ are not symbols of K, then
by possible definitions of $A_j^n$, $f_j^n$, and $a_j$ in K we mean, respectively,
expressions of the form


i. $(\forall x_1) \ldots (\forall x_n)(A_j^n(x_1, \ldots, x_n) \Leftrightarrow \mathcal{B}(x_1, \ldots, x_n))$

ii. $(\forall x_1) \ldots (\forall x_n)(\forall y)(f_j^n(x_1, \ldots, x_n) = y \Leftrightarrow \mathcal{C}(x_1, \ldots, x_n, y))$

iii. $(\forall y)(a_j = y \Leftrightarrow \mathcal{D}(y))$

where $\mathcal{B}$, $\mathcal{C}$, and $\mathcal{D}$ are wfs of K; moreover, in case (ii),
we must also have $\vdash_K (\forall x_1) \ldots (\forall x_n)(\exists_1 y)\mathcal{C}(x_1, \ldots, x_n, y)$,
and, in case (iii), $\vdash_K (\exists_1 y)\mathcal{D}(y)$. Moreover, add to (ii) the
requirement of n new equality axioms of the form
$y = z \Rightarrow f_j^n(x_1, \ldots, x_{i-1}, y, x_{i+1}, \ldots, x_n) = f_j^n(x_1, \ldots, x_{i-1}, z, x_{i+1}, \ldots, x_n)$.
If K is consistent, prove that addition of any possible definitions
to K as new axioms (using only one possible definition
for each symbol and assuming that the set of new nonlogical
constants and the set of possible definitions are recursive) yields
a consistent theory K’, and K’ is recursively undecidable if and
only if K is.

b. By a nonlogical constant we mean a predicate letter, function letter
or individual constant. Let $K_1$ be a theory with equality that
has a finite number of nonlogical constants. Then $K_1$ is said to be
interpretable in a theory with equality K if we can associate with

* The result in this exercise is, in a sense, the best possible. By a theorem of Kalmár (1936), there
is an effective procedure producing for each wf $\mathcal{B}$ of the pure predicate calculus another wf
$\mathcal{B}_2$ of the pure predicate calculus such that $\mathcal{B}_2$ contains only one predicate letter, a binary one,
and such that $\mathcal{B}_2$ is logically valid if and only if $\mathcal{B}$ is logically valid. (For another proof, see
Church, 1956, § 47.) Hence, by Church’s theorem, there is no decision procedure for logical
validity of wfs that contain only binary predicate letters. (For another proof, see Exercise 4.68
on page 277.)
each nonlogical constant of $K_1$ that is not a nonlogical constant of
K a possible definition in K such that, if K* is the theory obtained
from K by adding these possible definitions as axioms, then every
axiom (and hence every theorem) of $K_1$ is a theorem of K*. Notice
that, if $K_1$ is interpretable in K, it is interpretable in every extension
of K. Prove that, if $K_1$ is interpretable in K and K is consistent, and
if $K_1$ is essentially recursively undecidable, then K is essentially
recursively undecidable.


3.62 Let K be a theory with equality and $A_j^1$ a monadic predicate letter
not in K. Given a closed wf $\mathcal{C}$, let $\mathcal{C}^{(A_j^1)}$ (called the relativization of
$\mathcal{C}$ with respect to $A_j^1$) be the wf obtained from $\mathcal{C}$ by replacing every
subformula (starting from the smallest subformulas) of the form
$(\forall x)\mathcal{B}(x)$ by $(\forall x)(A_j^1(x) \Rightarrow \mathcal{B}(x))$. Let the proper axioms of a new
theory with equality $K^{A_j^1}$ be: (i) all wfs $\mathcal{C}^{(A_j^1)}$, where $\mathcal{C}$ is the closure
of any proper axiom of K; (ii) $(\exists x)A_j^1(x)$; (iii) $A_j^1(a_m)$ for each
individual constant $a_m$ of K; (iv) $x_1 = x_2 \Rightarrow (A_j^1(x_1) \Rightarrow A_j^1(x_2))$; and
(v) $A_j^1(x_1) \land \ldots \land A_j^1(x_n) \Rightarrow A_j^1(f_k^n(x_1, \ldots, x_n))$ for any function letter
$f_k^n$ of K. Prove the following.

a. As proper axioms of $K^{A_j^1}$ we could have taken all wfs $\mathcal{C}^{(A_j^1)}$, where
$\mathcal{C}$ is the closure of any theorem of K.

b. $K^{A_j^1}$ is interpretable in K.

c. $K^{A_j^1}$ is consistent if and only if K is consistent.

d. $K^{A_j^1}$ is essentially recursively undecidable if and only if K is (Tarski
et al., 1953, pp. 27-28).


3.63 K is said to be relatively interpretable in K’ if there is some predicate
letter $A_j^1$ not in K such that $K^{A_j^1}$ is interpretable in K’. If K is relatively
interpretable in a consistent theory with equality K’ and K is essentially
recursively undecidable, prove that K’ is essentially recursively
undecidable.


3.64 Call a theory K in which RR is relatively interpretable sufficiently
strong. Prove that any sufficiently strong consistent theory K is essentially
recursively undecidable, and, if K is also recursively axiomatizable,
prove that K is incomplete. Roughly speaking, we may say that
K is sufficiently strong if the notions of natural number, 0, 1, addi- 
tion and multiplication are “definable” in K in such a way that the 
axioms of RR (relativized to the “natural numbers” of K) are prov- 
able in K. Clearly, any theory adequate for present-day mathematics 
will be sufficiently strong and so, if it is consistent, then it will be 
recursively undecidable and, if it is recursively axiomatizable, then it 
will be incomplete. If we accept Church’s thesis, this implies that any 
consistent sufficiently strong theory will be effectively undecidable
and, if it is axiomatic, it will have undecidable sentences. (Similar
results also hold for higher-order theories; for example, see Gödel,
1931.) This destroys all hope for a consistent and complete axiomatization of
mathematics.


3.7 Nonstandard Models 


Recall from Section 3.1 that the standard model is the interpretation of the lan-
guage ℒ_A of arithmetic in which:


a. The domain is the set of nonnegative integers

b. The integer 0 is the interpretation of the symbol 0

c. The successor operation (addition of 1) is the interpretation of the
function ′ (that is, of f₁¹)

d. Ordinary addition and multiplication are the interpretations of +
and ·

e. The predicate letter = is interpreted by the identity relation


By a nonstandard model of arithmetic we shall mean any normal interpretation
M of ℒ_A that is not isomorphic to the standard model and in which all formu-
las are true that are true in the standard model (that is, M and the standard
model are elementarily equivalent). Also of interest are nonstandard models of
S, that is, normal models of S that are not isomorphic to the standard model,
and much of what we prove about nonstandard models of arithmetic also
holds for nonstandard models of S. (Of course, all nonstandard models of
arithmetic would be nonstandard models of S, since all axioms of S are true
in the standard model.)

There exist denumerable nonstandard models of arithmetic. Proof:
Remember (page 221) that 𝒩 is the theory whose axioms are all wfs true
for the standard interpretation. Add a constant c to the language of arith-
metic and consider the theory K obtained from 𝒩 by adding the axioms
c ≠ n̄ for all numerals n̄. K is consistent, since a proof of a contradiction
would use only finitely many axioms and any finite set of axioms of
K has as a model the standard model with a suitable interpretation of c.
(If c ≠ n̄₁, c ≠ n̄₂, ..., c ≠ n̄ᵣ are the new axioms in the finite set, choose the inter-
pretation of c to be a natural number not in {n₁, ..., nᵣ}.) By Proposition 2.26,
K has a finite or denumerable normal model M. M is not finite, since the
interpretations of the numerals will be distinct. M will be a nonstandard
model of arithmetic. (If M were isomorphic to the standard model, the inter-
pretation c_M of c would correspond under the isomorphism to some natural
number m and the axiom c ≠ m̄ would be false.)
Let us see what a nonstandard model of arithmetic M must look like.
Remember that all wfs true in the standard model are also true in M. So, for
every x in M, there is no element between x and its successor x′ = x + 1. Thus,
if z is the interpretation of 0, then z, z′, z″, z‴, ..., form an initial segment
z < z′ < z″ < z‴ < ... of M that is isomorphic to the standard model. Let us call
these elements the standard elements of M. The other elements of M will be
greater than the standard elements and will be called nonstandard elements
of M. Since every nonzero element w of M has an “immediate predecessor” u
such that w = u′, and u will have an immediate predecessor t, and so on, every
nonstandard element w will belong to a block B_w = {..., t, u, w, w′, w″, ...} con-
sisting of nonstandard elements. B_w is isomorphic to a copy of the ordinary
integers, where w, w′, w″, ... correspond to 0, 1, 2, ..., and ..., t, u correspond
to ..., −2, −1. More precisely, we can define a binary relation R on the set of
nonstandard elements by specifying that x R y if and only if there is a stan-
dard element s such that x + s = y or y + s = x. R is an equivalence relation and
the resulting equivalence classes are the blocks. The blocks inherit an order
relation from M. If one element of a block B₁ is less than an element of a
different block B₂, then every element of B₁ is less than every element of B₂;
in that case, we specify that B₁ < B₂. The resulting ordering of the blocks is
obviously a total order and it is dense and without first or last member. (See
Exercise 2.67.) To see that there is no last member, note that, if w belongs to a
block B, then 2w belongs to a larger block. To see that there is no first member,
note that, if w belongs to a block B, then there exists a nonstandard element x
such that either w = 2x or w = 2x + 1, and, therefore, the block of x is smaller
than B. To show that the ordering is dense, assume that x belongs to a block B₁
and that y belongs to a larger block B₂. We may assume that x and y are even.
(If x is not even, we could consider x + 1, and similarly for y.) Then there is a
nonstandard element z such that 2z = x + y. We leave it as an exercise to check
that x < z and z < y and that the block of z is strictly between B₁ and B₂.


Exercise 3.65 


Show that, if <₁ and <₂ are dense total orders without first and last element
and their domains D₁ and D₂ are denumerable, then there is a “similarity
mapping” f from D₁ onto D₂ (that is, for any x and y in D₁, x <₁ y if and only
if f(x) <₂ f(y)). (Hint: Start with enumerations ⟨a₁, a₂, ...⟩ and ⟨b₁, b₂, ...⟩ of D₁
and D₂. Map a₁ to b₁. Then look at a₂. If a₂ >₁ a₁, map a₂ to the first unused bⱼ
such that bⱼ >₂ b₁ in D₂. On the other hand, if a₂ <₁ a₁, map a₂ to the first unused
bⱼ such that bⱼ <₂ b₁ in D₂. Now look at a₃, observe its relation to a₁ and a₂, and
map a₃ to the first unused bₖ so that the b’s are in the same relation as the a’s.
Continue to extend the mapping in similar fashion.)
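
The construction in the hint can be tried out on finite prefixes of two enumerations. The following Python sketch is only an illustration under stated assumptions: D₁ is taken to be the dyadic rationals in (0, 1), D₂ the set of all rationals, both with the usual order, and the generators `dyadics` and `rationals` and the function `similarity` are hypothetical names chosen here. It builds the mapping for the first 20 elements of D₁; it does not prove that the full mapping is onto.

```python
from fractions import Fraction
from itertools import count, islice

def dyadics():
    """Enumerate the dyadic rationals in (0, 1): 1/2, 1/4, 3/4, 1/8, ..."""
    for k in count(1):
        for m in range(1, 2 ** k, 2):
            yield Fraction(m, 2 ** k)

def rationals():
    """Enumerate all rationals without repetition: 0, 1, -1, 2, -2, 1/2, ..."""
    seen = set()
    for s in count(1):
        for q in range(1, s + 1):
            p = s - q
            for r in (Fraction(p, q), Fraction(-p, q)):
                if r not in seen:
                    seen.add(r)
                    yield r

def similarity(a_items, b_enum):
    """Send each a to the first unused b lying in the same relative position."""
    b_seen, f = [], {}
    for a in a_items:
        i = 0
        while True:
            if i == len(b_seen):
                b_seen.append(next(b_enum))   # extend the enumeration lazily
            b = b_seen[i]
            unused = b not in f.values()
            if unused and all((x < a) == (fx < b) for x, fx in f.items()):
                break
            i += 1
        f[a] = b
    return f

f = similarity(list(islice(dyadics(), 20)), rationals())
```

The search for a suitable b always terminates here because the target order is dense and has no first or last element, which is exactly the hypothesis of the exercise.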

Note that, with respect to its natural ordering, the set of rational num-
bers is a denumerable densely and totally ordered set without first or last
element. So, by Exercise 3.65, the totally ordered set of blocks of any
denumerable nonstandard model of arithmetic looks just like the ordered set
of rational numbers. Thus, the model can be pictured in the following way:
First come the natural numbers 0, 1, 2, .... These are followed by a
denumerable collection of blocks, where each block looks like the integers in
their natural order, and this denumerable collection of blocks is ordered just
like the rational numbers.
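
The picture just described concerns only the order of the model, and that order can be imitated concretely. The sketch below is an assumption-laden illustration: elements are encoded as Python tuples, each block is labelled by a rational number, and the names `std`, `nonstd`, `key`, and `succ` are hypothetical. It models the ordering and the successor operation only; by Tennenbaum's theorem, quoted below, no such simple encoding can also carry recursive addition and multiplication.

```python
from fractions import Fraction

def std(n):
    """A standard element, labelled by the natural number n."""
    return ("std", n)

def nonstd(q, k):
    """Element at integer offset k inside the block labelled by the rational q."""
    return ("blk", Fraction(q), k)

def key(e):
    """Sort key realising the order: all standard elements, then blocks by label."""
    if e[0] == "std":
        return (0, Fraction(0), e[1])
    return (1, e[1], e[2])

def succ(e):
    """The successor stays inside the same initial segment or block."""
    return std(e[1] + 1) if e[0] == "std" else nonstd(e[1], e[2] + 1)

sample = [nonstd(1, 0), std(5), nonstd(Fraction(1, 2), 7), nonstd(1, -3), std(0)]
ordered = sorted(sample, key=key)
```

Sorting `sample` places the two standard elements first, then the element of the block labelled 1/2, then the two elements of the block labelled 1.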


Exercise 3.66 


Prove that there is no wf Φ(x) of the language of arithmetic such that, in each
nonstandard model M of arithmetic, Φ is satisfied by those and only those
elements of M that are standard. (Hint: Note that Φ(0) and (∀x)(Φ(x) ⇒ Φ(x′))
would be true in M, and the principle of mathematical induction holds in M.)

We have proved that there are denumerable nonstandard models of S, as
well as denumerable nonstandard models of arithmetic. We may assume
that the domain of any such model M is the set ω of natural numbers. The
addition and multiplication operations in the model M are binary opera-
tions on ω (and the successor operation is a unary operation on ω). Stanley
Tennenbaum proved that, in any denumerable nonstandard model of arith-
metic, it is not the case that the addition and multiplication operations are
both recursive. (See Tennenbaum, 1959.) This was strengthened by Georg
Kreisel, who proved that addition cannot be recursive, and by Kenneth
McAloon, who proved that multiplication cannot be recursive, and by
George Boolos, who proved that addition and multiplication each cannot be
arithmetical. (See Kaye, 1991; Boolos et al., 2007.)


4 


Axiomatic Set Theory 


4.1 An Axiom System 


A prime reason for the increase in importance of mathematical logic in the 
twentieth century was the discovery of the paradoxes of set theory and the 
need for a revision of intuitive (and contradictory) set theory. Many differ- 
ent axiomatic theories have been proposed to serve as a foundation for set 
theory but, no matter how they may differ at the fringes, they all have as 
a common core the fundamental theorems that mathematicians require for 
their daily work. We make no claim about the superiority of the system we 
shall use except that, from a notational and conceptual standpoint, it is a 
convenient basis for present-day mathematics. 

We shall describe a first-order theory NBG, which is basically a system of
the same type as one originally proposed by J. von Neumann (1925, 1928)
and later thoroughly revised and simplified by R. Robinson (1937), Bernays
(1937-1954), and Gödel (1940). (We shall follow Gödel’s monograph to a great
extent, although there will be some significant differences.)*

NBG has a single predicate letter A₂²† but no function letter or individual
constants. In order to conform to the notation in Bernays (1937-1954) and
Gödel (1940), we shall use capital italic letters X₁, X₂, X₃, ... as variables instead
of x₁, x₂, x₃, ... . (As usual, we shall use X, Y, Z, ... to represent arbitrary vari-
ables.) We shall abbreviate A₂²(X, Y) by X ∈ Y, and ¬A₂²(X, Y) by X ∉ Y.

Intuitively, ∈ is to be thought of as the membership relation and the values
of the variables are to be thought of as classes. Classes are certain collections 
of objects. Some properties determine classes, in the sense that a property 
P may determine a class of all those objects that possess that property. This 
“interpretation” is as imprecise as the notions of “collection” and “property.” 
The axioms will reveal more about what we have in mind. They will provide 
us with the classes we need in mathematics and appear modest enough so 
that contradictions are not derivable from them. 

Let us define equality in the following way. 


* I coined the name NBG in honor of von Neumann, Bernays, and Gödel. Paul Halmos, who
favored the Zermelo-Fraenkel system, suggested that “NBG” stood for “No Bloody Good.”
† We use A₂² instead of A₁² because the latter was used previously for the equality relation.

Definition
X = Y for (∀Z)(Z ∈ X ⇔ Z ∈ Y)


Thus, two classes are equal when and only when they have the same 
members. 


Definitions 


X ⊆ Y for (∀Z)(Z ∈ X ⇒ Z ∈ Y) (inclusion)
X ⊂ Y for X ⊆ Y ∧ X ≠ Y (proper inclusion)


When X ⊆ Y, we say that X is a subclass of Y. When X ⊂ Y, we say that X is a
proper subclass of Y.
As easy consequences of these definitions, we have the following. 


Proposition 4.1* 


a. ⊢ X = Y ⇔ (X ⊆ Y ∧ Y ⊆ X)
b. ⊢ X = X
c. ⊢ X = Y ⇒ Y = X
d. ⊢ X = Y ⇒ (Y = Z ⇒ X = Z)


We shall now present the proper axioms of NBG, interspersing among the 
axioms some additional definitions and various consequences of the axioms. 

We shall define a class to be a set if it is a member of some class. Those 
classes that are not sets are called proper classes. 


Definitions 


M(X) for (∃Y)(X ∈ Y) (X is a set)
Pr(X) for ¬M(X) (X is a proper class)


It will be seen later that the usual derivations of the paradoxes now no lon-
ger lead to contradictions but only yield the results that various classes are
proper classes, not sets. The sets are intended to be those safe, comfortable
classes that are used by mathematicians in their daily work, whereas proper
classes are thought of as monstrously large collections that, if permitted to be
sets (i.e., allowed to belong to other classes), would engender contradictions.


* As usual, Z is to be the first variable different from X and Y.
† The subscript NBG will be omitted from ⊢_NBG in the rest of this chapter.


Exercise 
4.1 Prove ⊢ X ∈ Y ⇒ M(X).


The system NBG is designed to handle classes, not concrete individuals.*
The reason for this is that mathematics has no need for objects such as cows
and molecules; all mathematical objects and relations can be formulated in
terms of classes alone. If nonclasses are required for applications to other
sciences, then the system NBG can be modified slightly so as to apply to both
classes and nonclasses alike (see the system UR in Section 4.6 below).

Let us introduce lower-case letters x₁, x₂, ... as special restricted variables
for sets. In other words, (∀xⱼ)ℬ(xⱼ) stands for (∀X)(M(X) ⇒ ℬ(X)), that is, ℬ
holds for all sets, and (∃xⱼ)ℬ(xⱼ) stands for (∃X)(M(X) ∧ ℬ(X)), that is, ℬ holds
for some set. As usual, the variable X used in these definitions should be
the first one that does not occur in ℬ(xⱼ). We shall use x, y, z, ... to stand for
arbitrary set variables.


Example 


(∀X₁)(∀x)(∃y)(∃X₃)(X₁ ∈ x ∧ y ∈ X₃) stands for
(∀X₁)(∀X₂)(M(X₂) ⇒ (∃X₄)(M(X₄) ∧ (∃X₃)(X₁ ∈ X₂ ∧ X₄ ∈ X₃)))


Exercise 


4.2 Prove that ⊢ X = Y ⇔ (∀z)(z ∈ X ⇔ z ∈ Y). This is the so-called exten-
sionality principle: two classes are equal when and only when they
contain the same sets as members.


Axiom T 


X₁ = X₂ ⇒ (X₁ ∈ X₃ ⇔ X₂ ∈ X₃)
This axiom tells us that equal classes belong to the same classes. 


Exercise 


4.3 Prove that ⊢ M(Z) ∧ Z = Y ⇒ M(Y).


* If there were concrete individuals (that is, objects that are not classes), then the definition 
of equality would have to be changed, since all such individuals have the same members 
(namely, none at all). 
Proposition 4.2


NBG is a first-order theory with equality. 


Proof 


Use Proposition 4.1, axiom T, the definition of equality, and the discussion 
on page 97. 

Note that Proposition 4.2 entails the substitutivity of equality, which will 
be used frequently in what follows, usually without explicit mention. 


Axiom P (Pairing Axiom) 


(∀x)(∀y)(∃z)(∀u)(u ∈ z ⇔ u = x ∨ u = y)
Thus, for any sets x and y, there is a set z that has x and y as its only members. 


Exercises 


4.4 Prove ⊢ (∀x)(∀y)(∃₁z)(∀u)(u ∈ z ⇔ u = x ∨ u = y). This asserts that there
is a unique set z, called the unordered pair of x and y, such that z has
x and y as its only members. Use axiom P and the extensionality
principle.

4.5 Prove ⊢ (∀X)(M(X) ⇔ (∃y)(X ∈ y)).

4.6 Prove ⊢ (∃X)Pr(X) ⇒ ¬(∀Y)(∀Z)(∃W)(∀U)(U ∈ W ⇔ U = Y ∨ U = Z).


Axiom N (Null Set) 


(∃x)(∀y)(y ∉ x)


Thus, there is a set that has no members. From axiom N and the extensionality
principle, there is a unique set that has no members, that is, ⊢ (∃₁x)(∀y)(y ∉ x).
Therefore, we can introduce a new individual constant ∅ by means of the
following condition.


Definition
(∀y)(y ∉ ∅)


It then follows from axiom N and Exercise 4.3 that ∅ is a set.

Since we have (by Exercise 4.4) the uniqueness condition for the unordered
pair, we can introduce a new function letter g(x, y) to designate the unor-
dered pair of x and y. In accordance with the traditional notation, we shall
write {x, y} instead of g(x, y). Notice that we have to define a unique value for
{X, Y} for any classes X and Y, not only for sets x and y. We shall let {X, Y} be
∅ whenever X is not a set or Y is not a set. One can prove


⊢ (∃₁Z)([(¬M(X) ∨ ¬M(Y)) ∧ Z = ∅] ∨ [M(X) ∧ M(Y) ∧ (∀u)(u ∈ Z ⇔ u = X ∨ u = Y)]).


This justifies the introduction of a term {X, Y} satisfying the following
condition:


[M(X) ∧ M(Y) ∧ (∀u)(u ∈ {X, Y} ⇔ u = X ∨ u = Y)]
∨ [(¬M(X) ∨ ¬M(Y)) ∧ {X, Y} = ∅]


One can then prove ⊢ (∀x)(∀y)(∀u)(u ∈ {x, y} ⇔ u = x ∨ u = y) and
⊢ (∀X)(∀Y)M({X, Y}).


Definition 
{X} for {X, X} 


For a set x, {x} is called the singleton of x. It is a set that has x as its only 
member. 

In connection with these definitions, the reader should review Section 2.9
and, in particular, Proposition 2.28, which assures us that the introduction
of new individual constants and function letters, such as ∅ and {X, Y}, adds
nothing essentially new to the theory NBG.


Exercise 


4.7 a. Prove ⊢ {X, Y} = {Y, X}.
b. Prove ⊢ (∀x)(∀y)({x} = {y} ⇔ x = y).


Definition 
⟨X, Y⟩ for {{X}, {X, Y}}


For sets x and y, ⟨x, y⟩ is called the ordered pair of x and y.

The definition of ⟨X, Y⟩ does not have any intrinsic intuitive meaning. It
is just a convenient way (discovered by Kuratowski, 1921) to define ordered
pairs so that one can prove the characteristic property of ordered pairs
expressed in the following proposition.

Proposition 4.3


⊢ (∀x)(∀y)(∀u)(∀v)(⟨x, y⟩ = ⟨u, v⟩ ⇒ x = u ∧ y = v)


Proof

Assume ⟨x, y⟩ = ⟨u, v⟩. Then {{x}, {x, y}} = {{u}, {u, v}}. Since {x} ∈ {{x}, {x, y}},
{x} ∈ {{u}, {u, v}}. Hence, {x} = {u} or {x} = {u, v}. In either case, x = u. Now, {u, v}
∈ {{u}, {u, v}}; so, {u, v} ∈ {{x}, {x, y}}. Then {u, v} = {x} or {u, v} = {x, y}. Similarly,
{x, y} = {u} or {x, y} = {u, v}. If {u, v} = {x} and {x, y} = {u}, then x = y = u = v;
if not, {u, v} = {x, y}. Hence, {u, v} = {u, y}. So, if v ≠ u, then y = v; if v = u, then
y = v. Thus, in all cases, y = v.

Notice that the converse of Proposition 4.3 holds by virtue of the substitu-
tivity of equality.


Exercise 


4.8 a. Show that, instead of the definition of an ordered pair given in the
text, we could have used ⟨X, Y⟩ = {{∅, X}, {{∅}, Y}}; that is, Proposition
4.3 would still be provable with this new meaning of ⟨X, Y⟩.


b. Show that the ordered pair also could be defined as {{∅, {X}}, {{Y}}}.
(This was the first such definition, discovered by Wiener (1914). For a
thorough analysis of such definitions, see A. Oberschelp (1991).)


We now extend the definition of ordered pairs to ordered n-tuples. 


Definitions 
⟨X⟩ = X
⟨X₁, ..., Xₙ, Xₙ₊₁⟩ = ⟨⟨X₁, ..., Xₙ⟩, Xₙ₊₁⟩


Thus, ⟨X, Y, Z⟩ = ⟨⟨X, Y⟩, Z⟩ and ⟨X, Y, Z, U⟩ = ⟨⟨⟨X, Y⟩, Z⟩, U⟩.
It is easy to establish the following generalization of Proposition 4.3:


⊢ (∀x₁) ... (∀xₙ)(∀y₁) ... (∀yₙ)(⟨x₁, ..., xₙ⟩ = ⟨y₁, ..., yₙ⟩ ⇒
x₁ = y₁ ∧ ... ∧ xₙ = yₙ)
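
The Kuratowski pair and the left-nested n-tuples can be experimented with on hereditarily finite sets. The sketch below is an illustration only, assuming Python `frozenset` objects as stand-ins for sets; the names `pair`, `ntuple`, and `universe` are hypothetical. A finite check of this kind is a sanity test of Proposition 4.3 on three sets, not a proof of it.

```python
from itertools import product

EMPTY = frozenset()

def pair(x, y):
    """Kuratowski ordered pair {{x}, {x, y}}."""
    return frozenset({frozenset({x}), frozenset({x, y})})

def ntuple(*xs):
    """Ordered n-tuple: a 1-tuple is the object itself; longer ones nest to the left."""
    t = xs[0]
    for x in xs[1:]:
        t = pair(t, x)
    return t

ONE = frozenset({EMPTY})
TWO = frozenset({EMPTY, ONE})
universe = [EMPTY, ONE, TWO]

# Proposition 4.3 and its converse, checked on this small universe:
# two pairs are equal exactly when their components agree.
pairs_ok = all(
    (pair(x, y) == pair(u, v)) == (x == u and y == v)
    for x, y, u, v in product(universe, repeat=4)
)
```

Note that `pair(x, x)` collapses to {{x}}, which is the degenerate case handled separately in the proof above.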


Axioms of Class Existence 


(B1) (∃X)(∀u)(∀v)(⟨u, v⟩ ∈ X ⇔ u ∈ v) (∈-relation)
(B2) (∀X)(∀Y)(∃Z)(∀u)(u ∈ Z ⇔ u ∈ X ∧ u ∈ Y) (intersection)
(B3) (∀X)(∃Z)(∀u)(u ∈ Z ⇔ u ∉ X) (complement)
(B4) (∀X)(∃Z)(∀u)(u ∈ Z ⇔ (∃v)(⟨u, v⟩ ∈ X)) (domain)
(B5) (∀X)(∃Z)(∀u)(∀v)(⟨u, v⟩ ∈ Z ⇔ u ∈ X)
(B6) (∀X)(∃Z)(∀u)(∀v)(∀w)(⟨u, v, w⟩ ∈ Z ⇔ ⟨v, w, u⟩ ∈ X)
(B7) (∀X)(∃Z)(∀u)(∀v)(∀w)(⟨u, v, w⟩ ∈ Z ⇔ ⟨u, w, v⟩ ∈ X)


From axioms (B2)-(B4) and the extensionality principle, we obtain:


⊢ (∀X)(∀Y)(∃₁Z)(∀u)(u ∈ Z ⇔ u ∈ X ∧ u ∈ Y)


⊢ (∀X)(∃₁Z)(∀u)(u ∈ Z ⇔ u ∉ X)


⊢ (∀X)(∃₁Z)(∀u)(u ∈ Z ⇔ (∃v)(⟨u, v⟩ ∈ X))


These results justify the introduction of new function letters: ∩, ¯ (written
as an overbar), and 𝒟.


Definitions 


(∀u)(u ∈ X ∩ Y ⇔ u ∈ X ∧ u ∈ Y) (intersection of X and Y)
(∀u)(u ∈ X̄ ⇔ u ∉ X) (complement of X)
(∀u)(u ∈ 𝒟(X) ⇔ (∃v)(⟨u, v⟩ ∈ X)) (domain of X)
X ∪ Y = (X̄ ∩ Ȳ)¯ (union of X and Y)
V = ∅̄ (universal class)*
X − Y = X ∩ Ȳ (difference of X and Y)
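
These definitions reduce union and difference to intersection and complement. The following sketch mirrors that reduction in Python, under the loudly stated assumption that a small finite set `V` plays the role of the universal class (the real V is a proper class and cannot be enumerated); the function names are hypothetical.

```python
V = frozenset(range(8))            # finite stand-in for the universal class

def complement(x):
    """The complement of X: everything in V that is not in X."""
    return V - x

def intersection(x, y):
    """X ∩ Y."""
    return x & y

def union(x, y):
    """X ∪ Y, defined as the complement of (complement of X) ∩ (complement of Y)."""
    return complement(intersection(complement(x), complement(y)))

def difference(x, y):
    """X − Y, defined as X ∩ (complement of Y)."""
    return intersection(x, complement(y))

X = frozenset({0, 1, 2, 3})
Y = frozenset({2, 3, 4})
```

On these sample values the derived operations agree with Python's built-in `|` and `-` on frozensets, which is what Exercise 4.9 asserts in general.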


Exercises 


4.9 Prove:

a. ⊢ (∀u)(u ∈ X ∪ Y ⇔ u ∈ X ∨ u ∈ Y)

b. ⊢ (∀u)(u ∈ V)

c. ⊢ (∀u)(u ∈ X − Y ⇔ u ∈ X ∧ u ∉ Y)

4.10 Prove:

a. ⊢ X ∩ Y = Y ∩ X
b. ⊢ X ∪ Y = Y ∪ X
c. ⊢ X ⊆ Y ⇔ X ∩ Y = X
d. ⊢ X ⊆ Y ⇔ X ∪ Y = Y
e. ⊢ (X ∩ Y) ∩ Z = X ∩ (Y ∩ Z)
f. ⊢ (X ∪ Y) ∪ Z = X ∪ (Y ∪ Z)


* It will be shown later that V is a proper class, that is, V is not a set. 
g. ⊢ X ∩ X = X
h. ⊢ X ∪ X = X
...
⊢ ... ⇒ X − Y = X
...
⊢ X ∩ (Y ∪ Z) = (X ∩ Y) ∪ (X ∩ Z)
⊢ X ∪ (Y ∩ Z) = (X ∪ Y) ∩ (X ∪ Z)


4.11 Prove the following wfs.

a. ⊢ (∀X)(∃Z)(∀u)(∀v)(⟨u, v⟩ ∈ Z ⇔ ⟨v, u⟩ ∈ X) [Hint: Apply axioms (B5),
(B7), (B6), and (B4) successively.]

b. ⊢ (∀X)(∃Z)(∀u)(∀v)(∀w)(⟨u, v, w⟩ ∈ Z ⇔ ⟨u, w⟩ ∈ X) [Hint: Use (B5)
and (B7).]

c. ⊢ (∀X)(∃Z)(∀v)(∀x₁) ... (∀xₙ)(⟨x₁, ..., xₙ, v⟩ ∈ Z ⇔ ⟨x₁, ..., xₙ⟩ ∈ X) [Hint:
Use (B5).]

d. ⊢ (∀X)(∃Z)(∀v₁) ... (∀vₘ)(∀x₁) ... (∀xₙ)(⟨x₁, ..., xₙ, v₁, ..., vₘ⟩ ∈ Z ⇔
⟨x₁, ..., xₙ⟩ ∈ X) [Hint: Iteration of part (c).]

e. ⊢ (∀X)(∃Z)(∀v₁) ... (∀vₘ)(∀x₁) ... (∀xₙ)(⟨x₁, ..., xₙ₋₁, v₁, ..., vₘ, xₙ⟩ ∈ Z ⇔
⟨x₁, ..., xₙ⟩ ∈ X) [Hint: For m = 1, use (b), substituting ⟨x₁, ..., xₙ₋₁⟩ for
u and xₙ for w; the general case then follows by iteration.]

f. ⊢ (∀X)(∃Z)(∀x)(∀v₁) ... (∀vₘ)(⟨v₁, ..., vₘ, x⟩ ∈ Z ⇔ x ∈ X) [Hint: Use (B5)
and part (a).]

g. ⊢ (∀X)(∃Z)(∀x₁) ... (∀xₙ)(⟨x₁, ..., xₙ⟩ ∈ Z ⇔ (∃y)(⟨x₁, ..., xₙ, y⟩ ∈ X)) [Hint:
In (B4), substitute ⟨x₁, ..., xₙ⟩ for u and y for v.]

h. ⊢ (∀X)(∃Z)(∀u)(∀v)(∀w)(⟨v, u, w⟩ ∈ Z ⇔ ⟨u, w⟩ ∈ X) [Hint: Substitute
⟨u, w⟩ for u in (B5) and apply (B6).]

i. ⊢ (∀X)(∃Z)(∀v₁) ... (∀vₖ)(∀u)(∀w)(⟨v₁, ..., vₖ, u, w⟩ ∈ Z ⇔ ⟨u, w⟩ ∈ X)
[Hint: Substitute ⟨v₁, ..., vₖ⟩ for v in part (h).]


Now we can derive a general class existence theorem. By a predicative wf we mean
a wf φ(X₁, ..., Xₙ, Y₁, ..., Yₘ) whose variables occur among X₁, ..., Xₙ, Y₁, ..., Yₘ
and in which only set variables are quantified (i.e., φ can be abbreviated in such
a way that only set variables are quantified).


Examples 


(∃x₁)(x₁ ∈ Y₁) is predicative, whereas (∃Y₁)(x₁ ∈ Y₁) is not predicative.


Proposition 4.4 (Class Existence Theorem) 


Let φ(X₁, ..., Xₙ, Y₁, ..., Yₘ) be a predicative wf. Then ⊢ (∃Z)(∀x₁) ... (∀xₙ)
(⟨x₁, ..., xₙ⟩ ∈ Z ⇔ φ(x₁, ..., xₙ, Y₁, ..., Yₘ)).


Proof 


We shall consider only wfs φ in which no wf of the form Yᵢ ∈ W occurs, since
Yᵢ ∈ W can be replaced by (∃x)(x = Yᵢ ∧ x ∈ W), which is equivalent to
(∃x)[(∀z)(z ∈ x ⇔ z ∈ Yᵢ) ∧ x ∈ W]. Moreover, we may assume that φ contains no
wf of the form X ∈ X, since this may be replaced by (∃u)(u = X ∧ u ∈ X), which is
equivalent to (∃u)[(∀z)(z ∈ u ⇔ z ∈ X) ∧ u ∈ X]. We shall proceed now by
induction on the number k of connectives and quantifiers in φ (written with
restricted set variables).


Base: k = 0. Then φ has the form xᵢ ∈ xⱼ or xⱼ ∈ xᵢ or xᵢ ∈ Yₗ, where 1 ≤ i < j ≤ n.
For xᵢ ∈ xⱼ, axiom (B1) guarantees that there is some W₁ such that
(∀xᵢ)(∀xⱼ)(⟨xᵢ, xⱼ⟩ ∈ W₁ ⇔ xᵢ ∈ xⱼ). For xⱼ ∈ xᵢ, axiom (B1) implies that there is
some W₂ such that (∀xᵢ)(∀xⱼ)(⟨xⱼ, xᵢ⟩ ∈ W₂ ⇔ xⱼ ∈ xᵢ) and then, by Exercise
4.11(a), there is some W₃ such that (∀xᵢ)(∀xⱼ)(⟨xᵢ, xⱼ⟩ ∈ W₃ ⇔ xⱼ ∈ xᵢ). So, in both
cases, there is some W such that (∀xᵢ)(∀xⱼ)(⟨xᵢ, xⱼ⟩ ∈ W ⇔ φ(x₁, ..., xₙ, Y₁, ..., Yₘ)).
Then, by Exercise 4.11(i) with W = X, there is some Z₁ such that
(∀x₁) ... (∀xᵢ₋₁)(∀xᵢ)(∀xⱼ)(⟨x₁, ..., xᵢ₋₁, xᵢ, xⱼ⟩ ∈ Z₁ ⇔ φ(x₁, ..., xₙ, Y₁, ..., Yₘ)).
Hence, by Exercise 4.11(e) with Z₁ = X, there exists Z₂ such that
(∀x₁) ... (∀xᵢ)(∀xᵢ₊₁) ... (∀xⱼ)(⟨x₁, ..., xⱼ⟩ ∈ Z₂ ⇔ φ(x₁, ..., xₙ, Y₁, ..., Yₘ)).
Then, by Exercise 4.11(d) with Z₂ = X, there exists Z such that
(∀x₁) ... (∀xₙ)(⟨x₁, ..., xₙ⟩ ∈ Z ⇔ φ(x₁, ..., xₙ, Y₁, ..., Yₘ)). In the remaining case,
xᵢ ∈ Yₗ, the theorem follows by application of Exercise 4.11(f, d).


Induction step. Assume the theorem provable for all k < r and assume that φ
has r connectives and quantifiers.


a. φ is ¬ψ. By inductive hypothesis, there is some W such that (∀x₁) ... (∀xₙ)
(⟨x₁, ..., xₙ⟩ ∈ W ⇔ ψ(x₁, ..., xₙ, Y₁, ..., Yₘ)). Let Z = W̄.

b. φ is ψ ⇒ ϑ. By inductive hypothesis, there are classes Z₁ and Z₂ such
that (∀x₁) ... (∀xₙ)(⟨x₁, ..., xₙ⟩ ∈ Z₁ ⇔ ψ(x₁, ..., xₙ, Y₁, ..., Yₘ)) and
(∀x₁) ... (∀xₙ)(⟨x₁, ..., xₙ⟩ ∈ Z₂ ⇔ ϑ(x₁, ..., xₙ, Y₁, ..., Yₘ)). Let Z = (Z₁ ∩ Z̄₂)¯.

c. φ is (∀x)ψ. By inductive hypothesis, there is some W such that
(∀x₁) ... (∀xₙ)(∀x)(⟨x₁, ..., xₙ, x⟩ ∈ W ⇔ ψ(x₁, ..., xₙ, x, Y₁, ..., Yₘ)). Apply
Exercise 4.11(g) with X = W̄ to obtain a class Z₁ such that
(∀x₁) ... (∀xₙ)(⟨x₁, ..., xₙ⟩ ∈ Z₁ ⇔ (∃x)¬ψ(x₁, ..., xₙ, x, Y₁, ..., Yₘ)). Now let
Z = Z̄₁, noting that (∀x)ψ is equivalent to ¬(∃x)¬ψ.
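
The induction in this proof is effectively a recipe: complement handles negation, complement and intersection handle implication, and complement together with the domain operation handles the universal quantifier. The sketch below follows that recipe over a small finite universe. It is only an illustration under several assumptions: formulas are encoded as nested Python tuples, variables are numbered from 0, class parameters Y₁, ..., Yₘ are omitted, the base case is computed directly rather than through (B1) and Exercise 4.11, and the universe `U` is a finite transitive collection of hereditarily finite sets. The name `extension` is hypothetical.

```python
from itertools import product

def extension(phi, n, U):
    """Class of n-tuples over U satisfying phi, built as in the induction step."""
    space = set(product(U, repeat=n))
    op = phi[0]
    if op == "in":                       # base case: x_i ∈ x_j
        _, i, j = phi
        return {t for t in space if t[i] in t[j]}
    if op == "not":                      # case (a): Z is the complement of W
        return space - extension(phi[1], n, U)
    if op == "imp":                      # case (b): complement of Z1 ∩ complement of Z2
        z1 = extension(phi[1], n, U)
        z2 = extension(phi[2], n, U)
        return space - (z1 & (space - z2))
    if op == "all":                      # case (c): the bound variable has index n
        w = extension(phi[1], n + 1, U)
        comp_w = set(product(U, repeat=n + 1)) - w
        z1 = {t[:-1] for t in comp_w}    # domain operation, as in Exercise 4.11(g)
        return space - z1
    raise ValueError(op)

EMPTY = frozenset()
ONE = frozenset({EMPTY})
TWO = frozenset({EMPTY, ONE})
U = [EMPTY, ONE, TWO]

# x0 ⊆ x1, written predicatively as (∀x2)(x2 ∈ x0 ⇒ x2 ∈ x1)
subset = ("all", ("imp", ("in", 2, 0), ("in", 2, 1)))
Z = extension(subset, 2, U)
```

Because every member of a member of `U` is again in `U`, quantifying over `U` captures genuine inclusion here, so `Z` is exactly the set of pairs (a, b) with a ⊆ b.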
Examples


1. Let φ(X, Y₁, Y₂) be (∃u)(∃v)(X = ⟨u, v⟩ ∧ u ∈ Y₁ ∧ v ∈ Y₂). The only
quantifiers in φ involve set variables. Hence, by the class existence
theorem, ⊢ (∃Z)(∀x)(x ∈ Z ⇔ (∃u)(∃v)(x = ⟨u, v⟩ ∧ u ∈ Y₁ ∧ v ∈ Y₂)). By
the extensionality principle,


⊢ (∃₁Z)(∀x)(x ∈ Z ⇔ (∃u)(∃v)(x = ⟨u, v⟩ ∧ u ∈ Y₁ ∧ v ∈ Y₂)).


So, we can introduce a new function letter ×.


Definition
(Cartesian product of Y₁ and Y₂)


(∀x)(x ∈ Y₁ × Y₂ ⇔ (∃u)(∃v)(x = ⟨u, v⟩ ∧ u ∈ Y₁ ∧ v ∈ Y₂))


Definitions


Y² for Y × Y
Yⁿ for Yⁿ⁻¹ × Y when n > 2
Rel(X) for X ⊆ V² (X is a relation)*


V² is the class of all ordered pairs, and Vⁿ is the class of all ordered n-tuples.
In ordinary language, the word “relation” indicates some kind of connection
between objects. For example, the parenthood relation holds between parents
and their children. For our purposes, we interpret the parenthood relation to
be the class of all ordered pairs ⟨u, v⟩ such that u is a parent of v.


2. Let φ(X, Y) be X ⊆ Y. By the class existence theorem and the extension-
ality principle, ⊢ (∃₁Z)(∀x)(x ∈ Z ⇔ x ⊆ Y). Thus, there is a unique class Z
that has as its members all subsets of Y. Z is called the power class of Y
and is denoted 𝒫(Y).


Definition


(∀x)(x ∈ 𝒫(Y) ⇔ x ⊆ Y)


3. Let φ(X, Y) be (∃v)(X ∈ v ∧ v ∈ Y). By the class existence theorem and
the extensionality principle, ⊢ (∃₁Z)(∀x)(x ∈ Z ⇔ (∃v)(x ∈ v ∧ v ∈ Y)).
Thus, there is a unique class Z that contains all members of members
of Y. Z is called the sum class of Y and is denoted ⋃Y.


* More precisely, Rel(X) means that X is a binary relation.


Definition


(∀x)(x ∈ ⋃Y ⇔ (∃v)(x ∈ v ∧ v ∈ Y))


4. Let φ(X) be (∃u)(X = ⟨u, u⟩). By the class existence theorem and the exten-
sionality principle, there is a unique class Z such that (∀x)(x ∈ Z ⇔ (∃u)
(x = ⟨u, u⟩)). Z is called the identity relation and is denoted I.


Definition
(∀x)(x ∈ I ⇔ (∃u)(x = ⟨u, u⟩))
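
For finite sets, the classes introduced in Examples 1 to 4 can be computed directly. The sketch below is an illustration only, assuming `frozenset` stand-ins for sets and Kuratowski pairs for ⟨u, v⟩; all function names are hypothetical, and `identity_on` produces only the finite part of I lying over a given set, since I itself is a proper class.

```python
from itertools import chain, combinations

def pair(x, y):
    """Kuratowski ordered pair {{x}, {x, y}}."""
    return frozenset({frozenset({x}), frozenset({x, y})})

def cartesian(y1, y2):
    """Y1 × Y2: all ordered pairs ⟨u, v⟩ with u ∈ Y1 and v ∈ Y2."""
    return frozenset(pair(u, v) for u in y1 for v in y2)

def power(y):
    """𝒫(Y): all subsets of Y."""
    elems = list(y)
    subsets = chain.from_iterable(combinations(elems, r) for r in range(len(elems) + 1))
    return frozenset(frozenset(s) for s in subsets)

def sum_class(y):
    """⋃Y: all members of members of Y."""
    return frozenset(x for v in y for x in v)

def identity_on(y):
    """The part of the identity relation I whose pairs ⟨u, u⟩ have u ∈ Y."""
    return frozenset(pair(u, u) for u in y)

EMPTY = frozenset()
ONE = frozenset({EMPTY})
TWO = frozenset({EMPTY, ONE})
```

For instance, `power(ONE)` evaluates to `TWO`, that is, 𝒫({∅}) = {∅, {∅}}, and `sum_class(power(TWO))` gives back `TWO`.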
Corollary 4.5 


If φ(X₁, ..., Xₙ, Y₁, ..., Yₘ) is a predicative wf, then


⊢ (∃₁W)(W ⊆ Vⁿ ∧ (∀x₁) ... (∀xₙ)(⟨x₁, ..., xₙ⟩ ∈ W ⇔ φ(x₁, ..., xₙ, Y₁, ..., Yₘ)))


Proof


By Proposition 4.4, there is some Z such that (∀x₁) ... (∀xₙ)(⟨x₁, ..., xₙ⟩ ∈ Z ⇔
φ(x₁, ..., xₙ, Y₁, ..., Yₘ)). Then W = Z ∩ Vⁿ satisfies the corollary, and the unique-
ness follows from the extensionality principle.


Definition 


Given a predicative wf φ(X₁, ..., Xₙ, Y₁, ..., Yₘ), let {⟨x₁, ..., xₙ⟩ | φ(x₁, ..., xₙ, Y₁, ...,
Yₘ)} denote the class of all n-tuples ⟨x₁, ..., xₙ⟩ that satisfy φ(x₁, ..., xₙ, Y₁, ...,
Yₘ); that is,


(∀u)(u ∈ {⟨x₁, ..., xₙ⟩ | φ(x₁, ..., xₙ, Y₁, ..., Yₘ)} ⇔
(∃x₁) ... (∃xₙ)(u = ⟨x₁, ..., xₙ⟩ ∧ φ(x₁, ..., xₙ, Y₁, ..., Yₘ)))


This definition is justified by Corollary 4.5. In particular, when n = 1, ⊢ (∀u)
(u ∈ {x | φ(x, Y₁, ..., Yₘ)} ⇔ φ(u, Y₁, ..., Yₘ)).
Examples


1. Take φ to be ⟨x₂, x₁⟩ ∈ Y. Let Y̆ be an abbreviation for {⟨x₁, x₂⟩ | ⟨x₂, x₁⟩ ∈ Y}.
Hence, Y̆ ⊆ V² ∧ (∀x₁)(∀x₂)(⟨x₁, x₂⟩ ∈ Y̆ ⇔ ⟨x₂, x₁⟩ ∈ Y). Call Y̆ the inverse
relation of Y.


2. Take φ to be (∃v)(⟨v, x⟩ ∈ Y). Let ℛ(Y) stand for {x | (∃v)(⟨v, x⟩ ∈ Y)}. Then
⊢ (∀u)(u ∈ ℛ(Y) ⇔ (∃v)(⟨v, u⟩ ∈ Y)). ℛ(Y) is called the range of Y. Clearly,
⊢ ℛ(Y) = 𝒟(Y̆).


Notice that axioms (B1)-(B7) are special cases of the class existence theo-
rem, Proposition 4.4. Thus, instead of the infinite number of instances of the
axiom schema in Proposition 4.4, it sufficed to assume only a finite number
of instances of that schema.


Exercises


4.12 Prove:

a. ⊢ ⋃∅ = ∅
b. ⊢ ⋃{∅} = ∅
c. ⊢ ⋃V = V
d. ⊢ 𝒫(V) = V
e. ⊢ X ⊆ Y ⇒ ⋃X ⊆ ⋃Y ∧ 𝒫(X) ⊆ 𝒫(Y)
f. ⊢ ⋃𝒫(X) = X
g. ⊢ X ⊆ 𝒫(⋃X)
h. ⊢ (X ∩ Y) × (W ∩ Z) = (X × W) ∩ (Y × Z)
i. ⊢ (X ∪ Y) × (W ∪ Z) = (X × W) ∪ (X × Z) ∪ (Y × W) ∪ (Y × Z)
j. ⊢ 𝒫(X ∩ Y) = 𝒫(X) ∩ 𝒫(Y)
k. ⊢ 𝒫(X) ∪ 𝒫(Y) ⊆ 𝒫(X ∪ Y)
l. ⊢ Rel(Y) ⇒ Y ⊆ 𝒟(Y) × ℛ(Y)
m. ⊢ ⋃(X ∪ Y) = (⋃X) ∪ (⋃Y)
n. ⊢ ⋃(X ∩ Y) ⊆ (⋃X) ∩ (⋃Y)
o. ⊢ Z = Y̆ ⇒ Z̆ = Y ∩ V²
p. ⊢ Rel(I) ∧ Ĭ = I
q. ⊢ 𝒫(∅) = {∅}
r. ⊢ 𝒫({∅}) = {∅, {∅}}
s. ⊢ (∀x)(∀y)(x × y ⊆ 𝒫(𝒫(x ∪ y)))


What simple condition on X and Y is equivalent to 𝒫(X ∪ Y) ⊆
𝒫(X) ∪ 𝒫(Y)?
Until now, although we can prove, using Proposition 4.4, the existence of a
great many classes, the existence of only a few sets, such as ∅, {∅}, {∅, {∅}},
and {{∅}}, is known to us. To guarantee the existence of sets of greater com-
plexity, we require more axioms.


Axiom U (Sum Set) 
(∀x)(∃y)(∀u)(u ∈ y ⇔ (∃v)(u ∈ v ∧ v ∈ x))


This axiom asserts that the sum class ⋃x of a set x is also a set, which we
shall call the sum set of x, that is, ⊢ (∀x)M(⋃x). The sum set ⋃x is usu-
ally referred to as the union of all the sets in the set x and is often denoted
⋃_{v∈x} v.


Exercises 


4.13 Prove:
a. ⊢ (∀x)(∀y)(⋃{x, y} = x ∪ y)
b. ⊢ (∀x)(∀y)M(x ∪ y)
c. ⊢ (∀x)(⋃{x} = x)
d. ⊢ (∀x)(∀y)(⋃⟨x, y⟩ = {x, y})
4.14 Define by induction {x₁, ..., xₙ} to be {x₁, ..., xₙ₋₁} ∪ {xₙ}. Prove
⊢ (∀x₁) ... (∀xₙ)(∀u)(u ∈ {x₁, ..., xₙ} ⇔ u = x₁ ∨ ... ∨ u = xₙ). Thus, for any
sets x₁, ..., xₙ, there is a set that has x₁, ..., xₙ as its only members.


Another means of generating new sets from old is the formation of the set 
of subsets of a given set. 


Axiom W (Power Set) 
(∀x)(∃y)(∀u)(u ∈ y ⇔ u ⊆ x)


This axiom asserts that the power class 𝒫(x) of a set x is itself a set, that is,
⊢ (∀x)M(𝒫(x)).
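
One reason the power set axiom matters is that it bounds other constructions: every Kuratowski pair ⟨u, v⟩ with u ∈ x and v ∈ y is a set of subsets of x ∪ y, so x × y lies inside 𝒫(𝒫(x ∪ y)) (compare Exercise 4.12(s)). The following sketch checks this inclusion on one small example. It is an illustration with `frozenset` stand-ins and hypothetical helper names, and a single finite check is of course not a proof.

```python
from itertools import chain, combinations

def pair(x, y):
    """Kuratowski ordered pair {{x}, {x, y}}."""
    return frozenset({frozenset({x}), frozenset({x, y})})

def power(x):
    """𝒫(x) for a finite set x."""
    elems = list(x)
    subsets = chain.from_iterable(combinations(elems, r) for r in range(len(elems) + 1))
    return frozenset(frozenset(s) for s in subsets)

EMPTY = frozenset()
ONE = frozenset({EMPTY})
TWO = frozenset({EMPTY, ONE})

x = frozenset({EMPTY, ONE})
y = frozenset({ONE, TWO})
product_xy = frozenset(pair(u, v) for u in x for v in y)
bound = power(power(x | y))
```

Here x ∪ y has three members, so `bound` has 2⁸ = 256 members, of which only four are the ordered pairs making up x × y.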

A much more general way to produce sets is the following axiom of 
subsets. 


Axiom S (Subsets) 


(∀x)(∀Y)(∃z)(∀u)(u ∈ z ⇔ u ∈ x ∧ u ∈ Y)

Corollary 4.6


a. ⊢ (∀x)(∀Y)M(x ∩ Y) (The intersection of a set and a class is a set.)
b. ⊢ (∀x)(∀Y)(Y ⊆ x ⇒ M(Y)) (A subclass of a set is a set.)
c. For any predicative wf ℬ(y), ⊢ (∀x)M({y | y ∈ x ∧ ℬ(y)}).


Proof


a. By axiom S, there is a set z such that (∀u)(u ∈ z ⇔ u ∈ x ∧ u ∈ Y), which
implies (∀u)(u ∈ z ⇔ u ∈ x ∩ Y). Thus, z = x ∩ Y and, therefore, x ∩ Y is a
set.

b. If Y ⊆ x, then x ∩ Y = Y and the result follows by part (a).

c. Let Y = {y | y ∈ x ∧ ℬ(y)}.* Since Y ⊆ x, part (b) implies that Y is a set.

Exercise 


4.15 Prove:
a. ⊢ (∀x)(M(𝒟(x)) ∧ M(ℛ(x))).
b. ⊢ (∀x)(∀y)M(x × y). [Hint: Exercise 4.12(s).]
c. ⊢ M(𝒟(Y)) ∧ M(ℛ(Y)) ∧ Rel(Y) ⇒ M(Y). [Hint: Exercise 4.12(t).]
d. ⊢ Pr(Y) ∧ Y ⊆ X ⇒ Pr(X).


On the basis of axiom S, we can show that the intersection of any nonempty 
class of sets is a set. 
Definition 


⋂X for {y | (∀x)(x ∈ X ⇒ y ∈ x)} (intersection)


Proposition 4.7


a. ⊢ (∀x)(x ∈ X ⇒ ⋂X ⊆ x)
b. ⊢ X ≠ ∅ ⇒ M(⋂X)
c. ⊢ ⋂∅ = V


Proof
a. Assume u ∈ X. Consider any y in ⋂X. Then (∀x)(x ∈ X ⇒ y ∈ x). Hence,
y ∈ u. Thus, ⋂X ⊆ u.


* More precisely, the wf Y ∈ X ∧ ℬ(Y) is predicative, so that the class existence theorem yields
a class {y | y ∈ X ∧ ℬ(y)}. In our case, X is a set x.


b. Assume X ≠ ∅. Let x ∈ X. By part (a), ⋂X ⊆ x. Hence, by Corollary
4.6(b), ⋂X is a set.

c. Since ⊢ (∀x)(x ∉ ∅), ⊢ (∀y)(∀x)(x ∈ ∅ ⇒ y ∈ x), from which we obtain
⊢ (∀y)(y ∈ ⋂∅). From ⊢ (∀y)(y ∈ V) and the extensionality principle, ⊢ ⋂∅ = V.
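
For finite families, the intersection just defined can be computed directly.
The following is a minimal sketch under the assumption that sets are modeled
by Python frozensets; the name big_intersection is hypothetical. The empty
family is rejected because its intersection is the universal class V, which
has no finite representation.

```python
from functools import reduce

def big_intersection(X):
    """⋂X for a nonempty finite family X of frozensets."""
    members = list(X)
    if not members:
        # Mirrors part (c): the intersection of the empty class is V, not a set.
        raise ValueError("the intersection of the empty family is the universal class")
    return reduce(frozenset.intersection, members)

X = {frozenset({1, 2, 3}), frozenset({2, 3, 4})}
meet = big_intersection(X)

# Part (a): ⋂X is included in every member of X.
print(all(meet <= x for x in X))
```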


Exercise 


4.16 Prove: 


a. ⊢ ⋂{x, y} = x ∩ y
b. ⊢ ⋂{x} = x
c. ⊢ X ⊆ Y ⇒ ⋂Y ⊆ ⋂X


A stronger axiom than axiom S will be necessary for the full development of 
set theory. First, a few definitions are convenient. 


Definitions 


Fnc(X) for Rel(X) ∧ (∀x)(∀y)(∀z)(⟨x, y⟩ ∈ X ∧ ⟨x, z⟩ ∈ X ⇒ y = z)
(X is a function)
X: Y → Z for Fnc(X) ∧ 𝒟(X) = Y ∧ ℛ(X) ⊆ Z (X is a function from Y into Z)
Y↿X for X ∩ (Y × V) (restriction of X to the domain Y)
Fnc₁(X) for Fnc(X) ∧ Fnc(X̆) (X is a one-one function)
X′Y = z if (∀u)(⟨Y, u⟩ ∈ X ⇔ u = z), and X′Y = ∅ otherwise
X″Y = ℛ(Y↿X)


If there is a unique z such that ⟨y, z⟩ ∈ X, then z = X′y; otherwise, X′y = ∅.
If X is a function and y is a set in its domain, X′y is the value of the function
applied to y. If X is a function, X″Y is the range of X restricted to Y.*
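
The definitions of Fnc(X), Y↿X, X′y, and X″Y can be mirrored on finite data.
This is a sketch under stated assumptions, not part of NBG: a class X is
modeled by a Python set of 2-tuples, and the helper names is_function,
restrict, value, and image are hypothetical.

```python
def is_function(X):
    """Fnc(X): X consists of pairs, with at most one second component per first."""
    seen = {}
    for pair in X:
        if not (isinstance(pair, tuple) and len(pair) == 2):
            return False
        y, z = pair
        if y in seen and seen[y] != z:
            return False
        seen[y] = z
    return True

def restrict(Y, X):
    """Y↿X: the pairs of X whose first component lies in Y."""
    return {(y, z) for (y, z) in X if y in Y}

def value(X, y, empty=frozenset()):
    """X′y: the unique z with ⟨y, z⟩ in X, or ∅ when no unique z exists."""
    zs = {z for (a, z) in X if a == y}
    return next(iter(zs)) if len(zs) == 1 else empty

def image(X, Y):
    """X″Y: the range of X restricted to Y."""
    return {z for (_, z) in restrict(Y, X)}

F = {(0, 'a'), (1, 'b'), (2, 'a')}
print(is_function(F), value(F, 1), image(F, {0, 2}))
```

Note that value returns ∅ outside the domain, matching the convention that
X′y = ∅ whenever there is no unique z with ⟨y, z⟩ ∈ X.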


Exercise 


4.17 Prove: 
a. ⊢ Fnc(X) ∧ y ∈ 𝒟(X) ⇒ (∀z)(X′y = z ⇔ ⟨y, z⟩ ∈ X)
b. ⊢ Fnc(X) ∧ Y ⊆ 𝒟(X) ⇒ Fnc(Y↿X) ∧ 𝒟(Y↿X) = Y ∧ (∀y)(y ∈ Y ⇒ X′y =
(Y↿X)′y)
c. ⊢ Fnc(X) ⇒ [Fnc₁(X) ⇔ (∀y)(∀z)(y ∈ 𝒟(X) ∧ z ∈ 𝒟(X) ∧ y ≠ z ⇒ X′y ≠ X′z)]
d. ⊢ Fnc(X) ∧ Y ⊆ 𝒟(X) ⇒ (∀z)(z ∈ X″Y ⇔ (∃y)(y ∈ Y ∧ X′y = z))


* In traditional set-theoretic notation, if F is a function and y is in its domain, F′y is written as
F(y), and if Y is included in the domain of F, F″Y is sometimes written as F[Y].


Axiom R (Replacement)


Fnc(Y) ⇒ (∀x)(∃y)(∀u)(u ∈ y ⇔ (∃v)(⟨v, u⟩ ∈ Y ∧ v ∈ x))


Axiom R asserts that, if Y is a function and x is a set, then the class of second
components of ordered pairs in Y whose first components are in x is a set (or,
equivalently, ℛ(x↿Y) is a set).


Exercises 


4.18 Show that, in the presence of the other axioms, the replacement axiom 
(R) implies the axiom of subsets (S). 


4.19 Prove ⊢ Fnc(Y) ⇒ (∀x)M(Y″x).
4.20 Show that axiom R is equivalent to the wf Fnc(Y) ∧ M(𝒟(Y)) ⇒ M(ℛ(Y)).


4.21 Show that, in the presence of all axioms except R and S, axiom R is
equivalent to the conjunction of axiom S and the wf Fnc₁(Y) ∧ M(𝒟(Y))
⇒ M(ℛ(Y)).

To ensure the existence of an infinite set, we add the following axiom. 


Axiom I (Axiom of Infinity) 


(∃x)(∅ ∈ x ∧ (∀u)(u ∈ x ⇒ u ∪ {u} ∈ x))


Axiom I states that there is a set x that contains ∅ and such that, whenever
a set u belongs to x, then u ∪ {u} also belongs to x. Hence, for such a set x,
{∅} ∈ x, {∅, {∅}} ∈ x, {∅, {∅}, {∅, {∅}}} ∈ x, and so on. If we let 1 stand for {∅},
2 for {∅, 1}, 3 for {∅, 1, 2}, ..., n for {∅, 1, 2, ..., n − 1}, etc., then, for all ordinary
integers n ≥ 0, n ∈ x, and ∅ ≠ 1, ∅ ≠ 2, 1 ≠ 2, ∅ ≠ 3, 1 ≠ 3, 2 ≠ 3, ....
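
The sets ∅, {∅}, {∅, {∅}}, ... named above can be built explicitly. The
following is a small sketch, assuming Python frozensets as a stand-in for
sets; the names successor and numeral are hypothetical.

```python
def successor(u):
    """Return u ∪ {u}."""
    return u | frozenset({u})

def numeral(n):
    """Return the set standing for the ordinary integer n >= 0."""
    x = frozenset()
    for _ in range(n):
        x = successor(x)
    return x

three = numeral(3)

# 3 = {∅, 1, 2}: its members are exactly the numerals for 0, 1, and 2.
print(three == frozenset({numeral(0), numeral(1), numeral(2)}))
```

Any set x satisfying axiom I must contain every numeral(n), which is why no
finite frozenset can serve as a witness for the axiom.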


Exercise 


4.22 a. Prove that any wf that implies (∃X)M(X) would, together with axiom
S, imply axiom N.
b. Show that axiom I is equivalent to the following sentence (I*):


(∃x)((∃y)(y ∈ x ∧ (∀u)(u ∉ y)) ∧ (∀u)(u ∈ x ⇒ u ∪ {u} ∈ x))


Then prove that (I*) implies axiom N. (Hence, if we assumed (I*)
instead of (I), axiom N would become superfluous.)


This completes the list of axioms of NBG, and we see that NBG has only
a finite number of axioms, namely, axiom T, axiom P (pairing), axiom
N (null set), axiom U (sum set), axiom W (power set), axiom S (subsets),
axiom R (replacement), axiom I (infinity), and the seven class existence
axioms (B1)-(B7). We have also seen that axiom S is provable from the other
axioms; it has been included here because it is of interest in the study of certain
weaker subtheories of NBG.

Let us verify now that the usual argument for Russell's paradox does not
hold in NBG. By the class existence theorem, there is a class Y = {x | x ∉ x}.
Then (∀x)(x ∈ Y ⇔ x ∉ x). In unabbreviated notation this becomes (∀X)(M(X)
⇒ (X ∈ Y ⇔ X ∉ X)). Assume M(Y). Then Y ∈ Y ⇔ Y ∉ Y, which, by the
tautology (A ⇔ ¬A) ⇒ (A ∧ ¬A), yields Y ∈ Y ∧ Y ∉ Y. Hence, by the derived
rule of proof by contradiction, we obtain ⊢ ¬M(Y). Thus, in NBG, the
argument for Russell's paradox merely shows that Russell's class Y is a proper
class, not a set. NBG will avoid the paradoxes of Cantor and Burali-Forti in
a similar way.


Exercise 


4.23 Prove ⊢ ¬M(V), that is, the universal class V is not a set. [Hint: Apply
Corollary 4.6(b) with Russell's class Y.]


4.2 Ordinal Numbers 


Let us first define some familiar notions concerning relations. 


Definitions 


X Irr Y for Rel(X) ∧ (∀y)(y ∈ Y ⇒ ⟨y, y⟩ ∉ X)
(X is an irreflexive relation on Y)

X Tr Y for Rel(X) ∧ (∀u)(∀v)(∀w)([u ∈ Y ∧ v ∈ Y ∧ w ∈ Y ∧
⟨u, v⟩ ∈ X ∧ ⟨v, w⟩ ∈ X] ⇒ ⟨u, w⟩ ∈ X)
(X is a transitive relation on Y)

X Part Y for (X Irr Y) ∧ (X Tr Y) (X partially orders Y)

X Con Y for Rel(X) ∧ (∀u)(∀v)([u ∈ Y ∧ v ∈ Y ∧ u ≠ v] ⇒ ⟨u, v⟩ ∈ X ∨ ⟨v, u⟩ ∈ X)
(X is a connected relation on Y)

X Tot Y for (X Irr Y) ∧ (X Tr Y) ∧ (X Con Y) (X totally orders Y)

X We Y for (X Irr Y) ∧ (∀Z)([Z ⊆ Y ∧ Z ≠ ∅] ⇒ (∃y)(y ∈ Z ∧
(∀v)(v ∈ Z ∧ v ≠ y ⇒ ⟨y, v⟩ ∈ X ∧ ⟨v, y⟩ ∉ X)))


(X well-orders Y, that is, the relation X is irreflexive on Y and every nonempty
subclass of Y has a least element with respect to X).


Exercises


4.24 Prove ⊢ X We Y ⇒ X Tot Y. [Hint: To show X Con Y, let x ∈ Y ∧ y ∈ Y ∧
x ≠ y. Then {x, y} has a least element, say x. Then ⟨x, y⟩ ∈ X. To show X
Tr Y, assume x ∈ Y ∧ y ∈ Y ∧ z ∈ Y ∧ ⟨x, y⟩ ∈ X ∧ ⟨y, z⟩ ∈ X. Then {x, y, z}
has a least element, which must be x.]


4.25 Prove ⊢ X We Y ∧ Z ⊆ Y ⇒ X We Z.


Examples (from intuitive set theory) 
1. The relation < on the set P of positive integers well-orders P. 


2. The relation < on the set of all integers totally orders, but does not well- 
order, this set. The set has no least element. 


3. The relation ⊂ on the set W of all subsets of the set of integers
partially orders W but does not totally order W. For example, {1} ⊄ {2} and
{2} ⊄ {1}.
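
On finite sets, the order properties defined in this section can be checked by
brute force. The sketch below is illustrative only: relations are modeled as
Python sets of pairs, and the function names are hypothetical. It cannot
reproduce Example 2, since every finite total order is a well-order.

```python
from itertools import chain, combinations

def irreflexive(X, Y):
    return all((y, y) not in X for y in Y)

def transitive(X, Y):
    return all((u, w) in X
               for u in Y for v in Y for w in Y
               if (u, v) in X and (v, w) in X)

def connected(X, Y):
    return all((u, v) in X or (v, u) in X
               for u in Y for v in Y if u != v)

def well_orders(X, Y):
    """X We Y, testing every nonempty subset Z of the finite set Y for a least element."""
    if not irreflexive(X, Y):
        return False
    items = list(Y)
    subsets = chain.from_iterable(
        combinations(items, r) for r in range(1, len(items) + 1))
    return all(
        any(all((y, v) in X and (v, y) not in X for v in Z if v != y) for y in Z)
        for Z in subsets
    )

# An initial piece of Example 1: < on {1, 2, 3}.
P = {1, 2, 3}
less = {(a, b) for a in P for b in P if a < b}

# A finite analogue of Example 3: proper inclusion on the subsets of {1, 2}.
W = [frozenset(), frozenset({1}), frozenset({2}), frozenset({1, 2})]
proper_subset = {(a, b) for a in W for b in W if a < b}

print(well_orders(less, P), connected(proper_subset, W))
```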


Definition 


Simp(Z, W₁, W₂) for (∃x₁)(∃x₂)(∃r₁)(∃r₂)(Rel(r₁) ∧ Rel(r₂) ∧ W₁ = ⟨r₁, x₁⟩ ∧
W₂ = ⟨r₂, x₂⟩ ∧ Fnc₁(Z) ∧ 𝒟(Z) = x₁ ∧ ℛ(Z) = x₂ ∧ (∀u)(∀v)(u ∈ x₁ ∧ v ∈ x₁ ⇒
(⟨u, v⟩ ∈ r₁ ⇔ ⟨Z′u, Z′v⟩ ∈ r₂)))

(Z is a similarity mapping of the relation r₁ on x₁ onto the relation r₂ on x₂.)


Definition


Sim(W₁, W₂) for (∃z)Simp(z, W₁, W₂)


(W₁ and W₂ are similar ordered structures)


Example

Let r₁ be the less-than relation < on the set A of nonnegative integers {0, 1,
2, ...}, and let r₂ be the less-than relation < on the set B of positive integers
{1, 2, 3, ...}. Let z be the set of all ordered pairs ⟨x, x + 1⟩ for x ∈ A. Then z is a
similarity mapping of ⟨r₁, A⟩ onto ⟨r₂, B⟩.
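
The example can be checked mechanically on a finite truncation. The sketch
below assumes that A is cut down to {0, ..., n − 1} and B to {1, ..., n}; the
function name is_similarity is hypothetical, and the check covers only these
finite pieces, not the infinite structures themselves.

```python
def is_similarity(z, r1, x1, r2, x2):
    """Finite check that z is a one-one map of x1 onto x2 with
    (u, v) in r1 exactly when (z(u), z(v)) in r2."""
    func = dict(z)
    dom = {u for (u, _) in z}
    ran = {v for (_, v) in z}
    one_one = len(func) == len(z) == len(ran)
    if not (one_one and dom == set(x1) and ran == set(x2)):
        return False
    return all(((u, v) in r1) == ((func[u], func[v]) in r2)
               for u in x1 for v in x1)

n = 6
A = set(range(n))             # truncation of the nonnegative integers
B = set(range(1, n + 1))      # truncation of the positive integers
r1 = {(a, b) for a in A for b in A if a < b}
r2 = {(a, b) for a in B for b in B if a < b}
z = {(x, x + 1) for x in A}
reversing = {(x, n - x) for x in A}   # one-one and onto, but order-reversing

print(is_similarity(z, r1, A, r2, B), is_similarity(reversing, r1, A, r2, B))
```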


Definition 


X₁ ∘ X₂ for {⟨u, v⟩ | (∃z)(⟨u, z⟩ ∈ X₂ ∧ ⟨z, v⟩ ∈ X₁)}
(the composition of X₂ and X₁)
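
Note the order of application in this definition: X₂ acts first, then X₁. A
minimal sketch on finite relations, with the hypothetical helper compose and
relations given as Python sets of pairs, makes the asymmetry visible.

```python
def compose(X1, X2):
    """X1 ∘ X2 = {(u, v) | some z has (u, z) in X2 and (z, v) in X1}."""
    return {(u, v) for (u, z) in X2 for (z2, v) in X1 if z == z2}

double = {(n, 2 * n) for n in range(5)}       # n ↦ 2n on {0, ..., 4}
plus_one = {(n, n + 1) for n in range(10)}    # n ↦ n + 1 on {0, ..., 9}

# plus_one ∘ double sends n to 2n + 1; double ∘ plus_one sends n to 2(n + 1)
# wherever n + 1 lies in the domain of double.
print(sorted(compose(plus_one, double)))
print(sorted(compose(double, plus_one)))
```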




Exercises 


4.26 Prove:
a. ⊢ Simp(Z, X, Y) ⇒ M(Z) ∧ M(X) ∧ M(Y)
b. ⊢ Simp(Z, X, Y) ⇒ Simp(Z̆, Y, X)

4.27 a. Prove: ⊢ Rel(X₁) ∧ Rel(X₂) ⇒ Rel(X₁ ∘ X₂)


b. Let X₁ and X₂ be the parent and brother relations on the set of human
beings. What are the relations X₁ ∘ X₁ and X₁ ∘ X₂?


c. Prove: ⊢ Fnc(X₁) ∧ Fnc(X₂) ⇒ Fnc(X₁ ∘ X₂)
d. Prove: ⊢ Fnc₁(X₁) ∧ Fnc₁(X₂) ⇒ Fnc₁(X₁ ∘ X₂)
e. Prove: ⊢ (X₁: Z → W ∧ X₂: Y → Z) ⇒ X₁ ∘ X₂: Y → W


Definitions


Fld(X) for 𝒟(X) ∪ ℛ(X) (the field of X)
TOR(X) for Rel(X) ∧ (X Tot (Fld(X))) (X is a total order)
WOR(X) for Rel(X) ∧ (X We (Fld(X))) (X is a well-ordering relation)


Exercise 


4.28 Prove:


a. ⊢ Sim(W₁, W₂) ⇒ Sim(W₂, W₁)

b. ⊢ Sim(W₁, W₂) ∧ Sim(W₂, W₃) ⇒ Sim(W₁, W₃)

c. ⊢ Sim(⟨X, Fld(X)⟩, ⟨Y, Fld(Y)⟩) ⇒ (TOR(X) ⇔ TOR(Y)) ∧ (WOR(X) ⇔
WOR(Y))


If x is a total order, then the class of all total orders similar to x is called the 
order type of x. We are especially interested in the order types of well-ordering 
relations, but, since it turns out that all order types are proper classes (except 
the order type {∅} of ∅), it will be convenient to find a class W of well-ordered
structures such that every well-ordering is similar to a unique member of W. 
This leads us to the study of ordinal numbers. 


Definitions 


E for {⟨x, y⟩ | x ∈ y} (the membership relation)

Trans(X) for (∀u)(u ∈ X ⇒ u ⊆ X) (X is transitive)

Sect_Y(X, Z) for Z ⊆ X ∧ (∀u)(∀v)([u ∈ X ∧ v ∈ Z ∧ ⟨u, v⟩ ∈ Y] ⇒ u ∈ Z)
(Z is a Y-section of X, that is, Z is a subclass of X and every
member of X that Y-precedes a member of Z is also a member
of Z.)

Seg_Y(X, W) for {x | x ∈ X ∧ ⟨x, W⟩ ∈ Y} (the Y-segment of X determined by W,
that is, the class of all members of X that Y-precede W).


Exercises 


4.29 Prove:
a. ⊢ Trans(X) ⇔ (∀u)(∀v)(v ∈ u ∧ u ∈ X ⇒ v ∈ X)
b. ⊢ Trans(X) ⇔ ⋃X ⊆ X
c. ⊢ Trans(∅)
d. ⊢ Trans({∅})
e. ⊢ Trans(X) ∧ Trans(Y) ⇒ Trans(X ∪ Y) ∧ Trans(X ∩ Y)
f. ⊢ Trans(X) ⇒ Trans(⋃X)
g. ⊢ (∀u)(u ∈ X ⇒ Trans(u)) ⇒ Trans(⋃X)
4.30 Prove:
a. ⊢ (∀u)[Seg_E(X, u) = X ∩ u ∧ M(Seg_E(X, u))]
b. ⊢ Trans(X) ⇔ (∀u)(u ∈ X ⇒ Seg_E(X, u) = u)
c. ⊢ E We X ∧ Sect_E(X, Z) ∧ Z ≠ X ⇒ (∃u)(u ∈ X ∧ Z = Seg_E(X, u))


Definitions 

Ord(X) for E We X ∧ Trans(X) (X is an ordinal class if and only if the
∈-relation well-orders X and any member of X is a subset of X)

On for {x | Ord(x)} (the class of ordinal numbers)


Thus, ⊢ (∀x)(x ∈ On ⇔ Ord(x)). An ordinal class that is a set is called an
ordinal number, and On is the class of all ordinal numbers. Notice that a wf
x ∈ On is equivalent to a predicative wf, namely, the conjunction of the
following wfs:


a. (∀u)(u ∈ x ⇒ u ∉ u)
b. (∀u)(u ⊆ x ∧ u ≠ ∅ ⇒ (∃v)(v ∈ u ∧ (∀w)(w ∈ u ∧ w ≠ v ⇒ v ∈ w ∧ w ∉ v)))
c. (∀u)(u ∈ x ⇒ u ⊆ x)


(The conjunction of (a) and (b) is equivalent to E We x, and (c) is Trans(x).) In
addition, any wf On ∈ Y can be replaced by the wf (∃y)(y ∈ Y ∧ (∀z)(z ∈ y ⇔
z ∈ On)). Hence, any wf that is predicative except for the presence of "On" is
equivalent to a predicative wf and therefore can be used in connection with
the class existence theorem.
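
Because conditions (a), (b), and (c) quantify only over sets, they can be tested
directly on hereditarily finite sets. The following is a brute-force sketch,
assuming nested Python frozensets as the model of sets; the function names are
hypothetical.

```python
from itertools import chain, combinations

def is_transitive_set(x):
    """Condition (c): every member of x is a subset of x."""
    return all(u <= x for u in x)

def membership_well_orders(x):
    """Conditions (a) and (b): ∈ is irreflexive on x and every nonempty
    subset u of x has a least element v with respect to ∈."""
    if any(u in u for u in x):   # never true for frozensets, kept for fidelity to (a)
        return False
    items = list(x)
    subsets = chain.from_iterable(
        combinations(items, r) for r in range(1, len(items) + 1))
    return all(
        any(all(v in w and w not in v for w in u if w != v) for v in u)
        for u in subsets
    )

def is_ordinal(x):
    return membership_well_orders(x) and is_transitive_set(x)

zero = frozenset()
one = frozenset({zero})
two = frozenset({zero, one})
odd = frozenset({zero, one, frozenset({one})})   # transitive, but ∈ is not connected on it

print(is_ordinal(two), is_ordinal(odd))
```

The set odd shows that transitivity alone is not enough: ∅ and {{∅}} are
distinct members, yet neither belongs to the other.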
Exercise


4.31 Prove: (a) ⊢ ∅ ∈ On. (b) ⊢ 1 ∈ On, where 1 stands for {∅}.


We shall use lower-case Greek letters α, β, γ, δ, τ, ... as restricted variables
for ordinal numbers. Thus, (∀α)ℬ(α) stands for (∀x)(x ∈ On ⇒ ℬ(x)), and
(∃α)ℬ(α) stands for (∃x)(x ∈ On ∧ ℬ(x)).


Proposition 4.8


a. ⊢ Ord(X) ⇒ (X ∉ X ∧ (∀u)(u ∈ X ⇒ u ∉ u))
b. ⊢ Ord(X) ∧ Y ⊂ X ∧ Trans(Y) ⇒ Y ∈ X
c. ⊢ Ord(X) ∧ Ord(Y) ⇒ (Y ⊂ X ⇔ Y ∈ X)
d. ⊢ Ord(X) ∧ Ord(Y) ⇒ [(X ∈ Y ∨ X = Y ∨ Y ∈ X) ∧ ¬(X ∈ Y ∧ Y ∈ X) ∧
¬(X ∈ Y ∧ X = Y)]
e. ⊢ Ord(X) ∧ Y ∈ X ⇒ Y ∈ On
f. ⊢ E We On
g. ⊢ Ord(On)
h. ⊢ ¬M(On)
i. ⊢ Ord(X) ⇒ X = On ∨ X ∈ On
j. ⊢ y ⊆ On ∧ Trans(y) ⇒ y ∈ On
k. ⊢ x ∈ On ∧ y ∈ On ⇒ (x ⊆ y ∨ y ⊆ x)


Proof


a. If Ord(X), then E is irreflexive on X; so, (∀u)(u ∈ X ⇒ u ∉ u); and, if X ∈ X,
X ∉ X. Hence, X ∉ X.

b. Assume Ord(X) ∧ Y ⊂ X ∧ Trans(Y). It is easy to see that Y is a proper
E-section of X. Hence, by Exercise 4.30(b, c), Y ∈ X.

c. Assume Ord(X) ∧ Ord(Y). If Y ∈ X, then Y ⊆ X, since X is transitive; but
Y ≠ X by (a); so, Y ⊂ X. Conversely, if Y ⊂ X, then, since Y is transitive,
we have Y ∈ X by (b).

d. Assume Ord(X) ∧ Ord(Y) ∧ X ≠ Y. Now, X ∩ Y ⊆ X and X ∩ Y ⊆ Y. Since
X and Y are transitive, so is X ∩ Y. If X ∩ Y ⊂ X and X ∩ Y ⊂ Y, then, by
(b), X ∩ Y ∈ X and X ∩ Y ∈ Y; hence, X ∩ Y ∈ X ∩ Y, contradicting the
irreflexivity of E on X. Hence, either X ∩ Y = X or X ∩ Y = Y; that is, X ⊆ Y
or Y ⊆ X. But X ≠ Y. Hence, by (c), X ∈ Y or Y ∈ X. Also, if X ∈ Y and
Y ∈ X, then, by (c), X ⊂ Y and Y ⊂ X, which is impossible. Clearly,
X ∈ Y ∧ X = Y is impossible, by (a).

e. Assume Ord(X) ∧ Y ∈ X. We must show E We Y and Trans(Y). Since
Y ∈ X and Trans(X), Y ⊂ X. Hence, since E We X, E We Y. Moreover,
if u ∈ Y and v ∈ u, then, by Trans(X), v ∈ X. Since E Con X and Y ∈ X ∧
v ∈ X, then v ∈ Y ∨ v = Y ∨ Y ∈ v. If either v = Y or Y ∈ v, then, since
E Tr X and u ∈ Y ∧ v ∈ u, we would have u ∈ u, contradicting (a). Hence
v ∈ Y. So, if u ∈ Y, then u ⊆ Y, that is, Trans(Y).

f. By (a), E Irr On. Now assume X ⊆ On ∧ X ≠ ∅. Let α ∈ X. If α is the least
element of X, we are done. (By least element of X we mean an element
v in X such that (∀u)(u ∈ X ∧ u ≠ v ⇒ v ∈ u).) If not, then E We α and X ∩
α ≠ ∅; let β be the least element of X ∩ α. It is obvious, using (d), that β is
the least element of X.

g. We must show E We On and Trans(On). The first part is (f). For the
second, if u ∈ On and v ∈ u, then, by (e), v ∈ On. Hence, Trans(On).

h. If M(On), then, by (g), On ∈ On, contradicting (a).

i. Assume Ord(X). Then X ⊆ On. If X ≠ On, then, by (c), X ∈ On.

j. Substitute On for X and y for Y in (b). By (h), y ⊂ On.

k. Use parts (d) and (c).


We see from Proposition 4.8(i) that the only ordinal class that is not an ordi- 
nal number is the class On itself. 


Definitions 


x <ₒ y for x ∈ On ∧ y ∈ On ∧ x ∈ y
x ≤ₒ y for y ∈ On ∧ (x = y ∨ x <ₒ y)


Thus, for ordinals, <ₒ is the same as ∈; so, <ₒ well-orders On. In particular,
from Proposition 4.8(e) we see that any ordinal x is equal to the set of smaller
ordinals.


Proposition 4.9 (Transfinite Induction) 


⊢ (∀β)[(∀α)(α ∈ β ⇒ α ∈ X) ⇒ β ∈ X] ⇒ On ⊆ X


(If, for every β, whenever all ordinals less than β are in X, β must also be in X,
then all ordinals are in X.)


Proof


Assume (∀β)[(∀α)(α ∈ β ⇒ α ∈ X) ⇒ β ∈ X]. Assume there is an ordinal in
On − X. Then, since On is well-ordered by E, there is a least ordinal β in On − X.
Hence, all ordinals less than β are in X. So, by hypothesis, β is in X, which is
a contradiction.

Proposition 4.9 is used to prove that all ordinals have a given property
ℬ(α). We let X = {x | ℬ(x) ∧ x ∈ On} and show that (∀β)[(∀α)(α ∈ β ⇒
ℬ(α)) ⇒ ℬ(β)].


Definition
x′ for x ∪ {x}


Proposition 4.10


a. ⊢ (∀x)(x ∈ On ⇔ x′ ∈ On)
b. ⊢ (∀α)¬(∃β)(α <ₒ β <ₒ α′)
c. ⊢ (∀α)(∀β)(α′ = β′ ⇒ α = β)


Proof


a. x ∈ x′. Hence, if x′ ∈ On, then x ∈ On by Proposition 4.8(e). Conversely,
assume x ∈ On. We must prove E We (x ∪ {x}) and Trans(x ∪ {x}). Since
E We x and x ∉ x, E Irr (x ∪ {x}). Also, if y ≠ ∅ ∧ y ⊆ x ∪ {x}, then either
y = {x}, in which case the least element of y is x, or y ∩ x ≠ ∅ and the least
element of y ∩ x is then the least element of y. Hence, E We (x ∪ {x}). In
addition, if y ∈ x ∪ {x} and u ∈ y, then u ∈ x. Thus, Trans(x ∪ {x}).

b. Assume α <ₒ β <ₒ α′. Then, α ∈ β ∧ β ∈ α′. Since α ∈ β, β ∉ α, and β ≠ α by
Proposition 4.8(d), contradicting β ∈ α′.

c. Assume α′ = β′. Then β <ₒ α′ and, by part (b), β ≤ₒ α. Similarly, α ≤ₒ β.
Hence, α = β.


Exercise


4.32 Prove: ⊢ (∀α)(α ⊂ α′)


Definitions 
Suc(X) for X ∈ On ∧ (∃α)(X = α′) (X is a successor ordinal)
K₁ for {x | x = ∅ ∨ Suc(x)} (the class of ordinals of the first kind)
ω for {x | x ∈ K₁ ∧ (∀u)(u ∈ x ⇒ u ∈ K₁)} (ω is the class of all ordinals
α of the first kind such that all ordinals smaller than α are
also of the first kind)


Example
⊢ ∅ ∈ ω ∧ 1 ∈ ω. (Recall that 1 = {∅}.)


Proposition 4.11


a. ⊢ (∀α)(α ∈ ω ⇔ α′ ∈ ω)
b. ⊢ M(ω)
c. ⊢ ∅ ∈ X ∧ (∀u)(u ∈ X ⇒ u′ ∈ X) ⇒ ω ⊆ X
d. ⊢ (∀α)(∀β)(α ∈ ω ∧ β <ₒ α ⇒ β ∈ ω)


Proof


a. Assume α ∈ ω. Since Suc(α′), α′ ∈ K₁. Also, if β ∈ α′, then β ∈ α or β = α.
Hence, β ∈ K₁. Thus, α′ ∈ ω. Conversely, if α′ ∈ ω, then, since α ∈ α′ and
(∀β)(β ∈ α ⇒ β ∈ α′), it follows that α ∈ ω.

b. By the axiom of infinity (I), there is a set x such that ∅ ∈ x and (∀u)(u ∈
x ⇒ u′ ∈ x). We shall prove ω ⊆ x. Assume not. Let α be the least ordinal
in ω − x. Clearly, α ≠ ∅, since ∅ ∈ x. Hence, Suc(α). So, (∃β)(α = β′). Let δ be
an ordinal such that α = δ′. Then δ <ₒ α and, by part (a), δ ∈ ω. Therefore,
δ ∈ x. Hence, δ′ ∈ x. But α = δ′. Therefore, α ∈ x, which yields a
contradiction. Thus, ω ⊆ x. So, M(ω) by Corollary 4.6(b).

c. This is proved by a procedure similar to that used for part (b).

d. This is left as an exercise.


The elements of ω are called finite ordinals. We shall use the standard
notation: 1 for ∅′, 2 for 1′, 3 for 2′, and so on. Thus, ∅ ∈ ω, 1 ∈ ω, 2 ∈ ω, 3 ∈ ω, ....

The nonzero ordinals that are not successor ordinals are called limit
ordinals.


Definition
Lim(x) for x ∈ On ∧ x ∉ K₁


Exercise


4.33 Prove:
a. ⊢ Lim(ω)
b. ⊢ (∀α)(∀β)(Lim(α) ∧ β <ₒ α ⇒ β′ <ₒ α).


Proposition 4.12 


a. ⊢ (∀x)(x ⊆ On ⇒ [⋃x ∈ On ∧ (∀α)(α ∈ x ⇒ α ≤ₒ ⋃x) ∧ (∀β)((∀α)
(α ∈ x ⇒ α ≤ₒ β) ⇒ ⋃x ≤ₒ β)]). (If x is a set of ordinals, then ⋃x is an
ordinal that is the least upper bound of x.)

b. ⊢ (∀x)([x ⊆ On ∧ x ≠ ∅ ∧ (∀α)(α ∈ x ⇒ (∃β)(β ∈ x ∧ α <ₒ β))] ⇒ Lim(⋃x)).
(If x is a nonempty set of ordinals without a maximum, then ⋃x is a
limit ordinal.)


Proof
a. Assume x ⊆ On. ⋃x, as a set of ordinals, is well-ordered by E.
Also, if α ∈ ⋃x ∧ β ∈ α, then there is some γ with γ ∈ x and α ∈ γ.
Then β ∈ α ∧ α ∈ γ; since every ordinal is transitive, β ∈ γ. So, β ∈ ⋃x.
Hence, ⋃x is transitive and, therefore, ⋃x ∈ On. In addition, if
α ∈ x, then α ⊆ ⋃x; so, α ≤ₒ ⋃x, by Proposition 4.8(c). Assume now
that (∀α)(α ∈ x ⇒ α ≤ₒ β). Clearly, if δ ∈ ⋃x, then there is some γ such
that δ ∈ γ ∧ γ ∈ x. Hence, γ ≤ₒ β and so, δ <ₒ β. Therefore, ⋃x ⊆ β and,
by Proposition 4.8(c), ⋃x ≤ₒ β.

b. Assume x ⊆ On ∧ x ≠ ∅ ∧ (∀α)(α ∈ x ⇒ (∃β)(β ∈ x ∧ α <ₒ β)). If ⋃x =
∅, then α ∈ x implies α = ∅. So, x = ∅ or x = 1, which contradicts our
assumption. Hence, ⋃x ≠ ∅. Assume Suc(⋃x). Then ⋃x = γ′ for
some γ. By part (a), ⋃x is a least upper bound of x. Therefore, γ is not
an upper bound of x; there is some δ in x with γ <ₒ δ. But then δ = ⋃x, since
⋃x is an upper bound of x. Thus, ⋃x is a maximum element of x,
contradicting our hypothesis. Hence, ¬Suc(⋃x), and Lim(⋃x) is the
only possibility left.


Exercise 


4.34 Prove: 
a. ⊢ (∀α)([Suc(α) ⇒ (⋃α)′ = α] ∧ [Lim(α) ⇒ ⋃α = α]).
b. If ∅ ≠ x ⊆ On, then ⋂x is the least ordinal in x.


We can now state and prove another form of transfinite induction. 


Proposition 4.13 (Transfinite Induction: Second Form) 


a. ⊢ [∅ ∈ X ∧ (∀α)(α ∈ X ⇒ α′ ∈ X) ∧ (∀α)(Lim(α) ∧ (∀β)(β <ₒ α ⇒ β ∈ X)
⇒ α ∈ X)] ⇒ On ⊆ X

b. (Induction up to δ.) ⊢ [∅ ∈ X ∧ (∀α)(α <ₒ δ ∧ α ∈ X
⇒ α′ ∈ X) ∧ (∀α)(α <ₒ δ ∧ Lim(α) ∧ (∀β)(β <ₒ α
⇒ β ∈ X) ⇒ α ∈ X)] ⇒ δ ⊆ X.

c. (Induction up to ω.) ⊢ ∅ ∈ X ∧ (∀α)(α <ₒ ω ∧ α ∈ X ⇒ α′ ∈ X) ⇒ ω ⊆ X.


Proof


a. Assume the antecedent. Let Y = {x | x ∈ On ∧ (∀α)(α ≤ₒ x ⇒ α ∈ X)}. It is
easy to prove that (∀α)(α <ₒ γ ⇒ α ∈ Y) ⇒ γ ∈ Y. Hence, by Proposition
4.9, On ⊆ Y. But Y ⊆ X. Hence, On ⊆ X.

b. The proof is left as an exercise.

c. This is a special case of part (b), noting that ⊢ (∀α)(α <ₒ ω ⇒ ¬Lim(α)).


Set theory depends heavily upon definitions by transfinite induction, which 
are justified by the following theorem. 




Proposition 4.14 


a. ⊢ (∀X)(∃₁Y)(Fnc(Y) ∧ 𝒟(Y) = On ∧ (∀α)(Y′α = X′(α↿Y))). (Given X, there is
a unique function Y defined on all ordinals such that the value of Y at α
is the value of X applied to the restriction of Y to the set of ordinals less
than α.)

b. ⊢ (∀x)(∀X₁)(∀X₂)(∃₁Y)(Fnc(Y) ∧ 𝒟(Y) = On ∧ Y′∅ = x ∧ (∀α)(Y′(α′) = X₁′(Y′α))
∧ (∀α)(Lim(α) ⇒ Y′α = X₂′(α↿Y))).

c. (Induction up to δ.) ⊢ (∀x)(∀X₁)(∀X₂)(∃₁Y)(Fnc(Y) ∧ 𝒟(Y) = δ ∧ Y′∅ = x
∧ (∀α)(α′ <ₒ δ ⇒ Y′(α′) = X₁′(Y′α)) ∧ (∀α)(Lim(α) ∧ α <ₒ δ ⇒ Y′α = X₂′
(α↿Y))).


Proof


a. Let Y₁ = {u | Fnc(u) ∧ 𝒟(u) ∈ On ∧ (∀α)(α ∈ 𝒟(u) ⇒ u′α = X′(α↿u))}. Now,
if u₁ ∈ Y₁ and u₂ ∈ Y₁, then u₁ ⊆ u₂ or u₂ ⊆ u₁. In fact, let γ₁ = 𝒟(u₁) and
γ₂ = 𝒟(u₂). Either γ₁ ≤ₒ γ₂ or γ₂ ≤ₒ γ₁; say, γ₁ ≤ₒ γ₂. Let w be the set of
ordinals α <ₒ γ₁ such that u₁′α ≠ u₂′α; assume w ≠ ∅ and let η be the
least ordinal in w. Then for all β <ₒ η, u₁′β = u₂′β. Hence, η↿u₁ = η↿u₂. But
u₁′η = X′(η↿u₁) and u₂′η = X′(η↿u₂); and so, u₁′η = u₂′η, contradicting
our assumption. Therefore, w = ∅; that is, for all α <ₒ γ₁, u₁′α = u₂′α.
Hence, u₁ = γ₁↿u₁ = γ₁↿u₂ ⊆ u₂. Thus, any two functions in Y₁ agree in
their common domain. Let Y = ⋃Y₁. We leave it as an exercise to prove
that Y is a function, the domain of which is either an ordinal or the
class On, and (∀α)(α ∈ 𝒟(Y) ⇒ Y′α = X′(α↿Y)). That 𝒟(Y) = On follows
easily from the observation that, if 𝒟(Y) = δ and if we let W = Y ∪ {⟨δ,
X′Y⟩}, then W ∈ Y₁; so, W ⊆ Y and δ ∈ 𝒟(Y) = δ, which contradicts the
fact that δ ∉ δ. The uniqueness of Y follows by a simple transfinite
induction (Proposition 4.9).


The proof of part (b) is similar to that of (a), and part (c) follows from (b). 
Using Proposition 4.14, one can introduce new function letters by transfi- 
nite induction. 


Examples 
1. Ordinal addition. In Proposition 4.14(b), take


x = β    X₁ = {⟨u, v⟩ | v = u′}    X₂ = {⟨u, v⟩ | v = ⋃ℛ(u)}


Hence, for each ordinal β, there is a unique function Y_β such that


Y_β′∅ = β ∧ (∀α)(Y_β′(α′) = (Y_β′α)′ ∧ [Lim(α) ⇒ Y_β′α = ⋃(Y_β″α)])


Hence there is a unique binary function +ₒ with domain (On)² such that,
for any ordinals β and γ, +ₒ(β, γ) = Y_β′γ. As usual, we write β +ₒ γ instead
of +ₒ(β, γ). Notice that:


β +ₒ ∅ = β
β +ₒ (γ′) = (β +ₒ γ)′
Lim(α) ⇒ β +ₒ α = ⋃_{τ <ₒ α} (β +ₒ τ)


In particular,


β +ₒ 1 = β +ₒ (∅′) = (β +ₒ ∅)′ = β′

2. Ordinal multiplication. In Proposition 4.14(b), take


x = ∅    X₁ = {⟨u, v⟩ | v = u +ₒ β}    X₂ = {⟨u, v⟩ | v = ⋃ℛ(u)}


Then, as in Example 1, one obtains a function β ×ₒ γ with the properties

β ×ₒ ∅ = ∅
β ×ₒ (γ′) = (β ×ₒ γ) +ₒ β
Lim(α) ⇒ β ×ₒ α = ⋃_{τ <ₒ α} (β ×ₒ τ)
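
Restricted to finite ordinals, where the limit clauses never apply, the
recursion equations for +ₒ and ×ₒ can be run as they stand. The sketch below
assumes finite ordinals are modeled by nested Python frozensets; the names
successor, numeral, predecessor, add, and mul are hypothetical, and nothing
here covers ω or larger ordinals.

```python
def successor(x):
    """x′ = x ∪ {x}."""
    return x | frozenset({x})

def numeral(n):
    """The finite ordinal standing for the ordinary integer n >= 0."""
    x = frozenset()
    for _ in range(n):
        x = successor(x)
    return x

def predecessor(x):
    """For a nonzero finite ordinal x = y′, return y (the member with most elements)."""
    return max(x, key=len)

def add(beta, gamma):
    """β +ₒ γ by recursion on the finite ordinal γ."""
    if not gamma:                                     # β +ₒ ∅ = β
        return beta
    return successor(add(beta, predecessor(gamma)))   # β +ₒ (γ′) = (β +ₒ γ)′

def mul(beta, gamma):
    """β ×ₒ γ by recursion on the finite ordinal γ."""
    if not gamma:                                     # β ×ₒ ∅ = ∅
        return frozenset()
    return add(mul(beta, predecessor(gamma)), beta)   # β ×ₒ (γ′) = (β ×ₒ γ) +ₒ β

print(add(numeral(2), numeral(3)) == numeral(5))
print(mul(numeral(2), numeral(3)) == numeral(6))
```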


Exercises 

4.35 Prove: ⊢ β ×ₒ 1 = β ∧ β ×ₒ 2 = β +ₒ β.

4.36 Justify the following definition of ordinal exponentiation*
exp(β, ∅) = 1
exp(β, γ′) = exp(β, γ) ×ₒ β
Lim(α) ⇒ exp(β, α) = ⋃_{∅ <ₒ τ <ₒ α} exp(β, τ)


For any class X, let E_X be the membership relation restricted to X; that is,
E_X = {⟨u, v⟩ | u ∈ v ∧ u ∈ X ∧ v ∈ X}.


* We use the notation exp(β, α) instead of β^α in order to avoid confusion with the notation X^Y to
be introduced later.


Proposition 4.15*


Let R be a well-ordering relation on a class Y; that is, R We Y. Let F be a function
from Y into Y such that, for any u and v in Y, if ⟨u, v⟩ ∈ R, then ⟨F′u, F′v⟩ ∈ R.
Then, for all u in Y, u = F′u or ⟨u, F′u⟩ ∈ R.

Proof


Let X = {u | ⟨F′u, u⟩ ∈ R}. We wish to show that X = ∅. Assume X ≠ ∅. Since
X ⊆ Y and R well-orders Y, there is an R-least element u₀ of X. Hence, ⟨F′u₀, u₀⟩
∈ R. Therefore ⟨F′(F′u₀), F′u₀⟩ ∈ R. Thus, F′u₀ ∈ X, but F′u₀ is R-smaller than u₀,
contradicting the definition of u₀.


Corollary 4.16


If Y is a class of ordinals, F: Y → Y, and F is increasing on Y (that is, α ∈ Y ∧ β
∈ Y ∧ α <ₒ β ⇒ F′α <ₒ F′β), then α ≤ₒ F′α for all α in Y.

Proof


In Proposition 4.15, let R be E_Y. Note that E_Y well-orders Y, by Proposition
4.8(f) and Exercise 4.25.


Corollary 4.17


Let α <ₒ β and y ⊆ α; that is, let y be a subset of a segment of β. Then ⟨E_β, β⟩ is
not similar to ⟨E_y, y⟩.


Proof


Assume ⟨E_β, β⟩ is similar to ⟨E_y, y⟩. Then there is a function f from β onto y such
that, for any u and v in β, u <ₒ v ⇔ f′u <ₒ f′v. Since the range of f is y, f′α ∈ y. But
y ⊆ α. Hence f′α <ₒ α. But, by Corollary 4.16, α ≤ₒ f′α, which yields a contradiction.


Corollary 4.18


a. For α ≠ β, ⟨E_α, α⟩ and ⟨E_β, β⟩ are not similar.
b. For any α, if f is a similarity mapping of ⟨E_α, α⟩ with ⟨E_α, α⟩, then f is
the identity mapping, that is, f′β = β for all β <ₒ α.


* From this point on, we shall express many theorems of NBG in English by using the
corresponding informal English translations. This is done to avoid writing lengthy wfs that are
difficult to decipher and only in cases where the reader should be able to produce from the
English version the precise wf of NBG.


Proof


a. Since α ≠ β, it follows by Proposition 4.8(d, c) that one of α and β is a
segment of the other; say, α is a segment of β. Then Corollary 4.17 tells
us that ⟨E_β, β⟩ is not similar to ⟨E_α, α⟩.

b. By Corollary 4.16, f′β ≥ₒ β for all β <ₒ α. But, noting by Exercise 4.26(b) that
f̆ is a similarity mapping of ⟨E_α, α⟩ with ⟨E_α, α⟩, we again use Corollary
4.16 to conclude that (f̆)′β ≥ₒ β for all β <ₒ α. Hence β = (f̆)′(f′β) ≥ₒ f′β ≥ₒ
β and, therefore, f′β = β.


Proposition 4.19 


Assume that a nonempty set u is the field of a well-ordering r. Then there is a
unique ordinal γ and a unique similarity mapping of ⟨E_γ, γ⟩ with ⟨r, u⟩.


Proof


Let F = {⟨v, w⟩ | w ∈ u − v ∧ (∀z)(z ∈ u − v ⇒ ⟨z, w⟩ ∉ r)}. F is a function such that,
if v is a subset of u and u − v ≠ ∅, then F′v is the r-least element of u − v. Let
X = {⟨v, w⟩ | ⟨ℛ(v), w⟩ ∈ F}. Now we use a definition by transfinite induction
(Proposition 4.14) to obtain a function Y with On as its domain such that (∀α)
(Y′α = X′(α↿Y)). Let W = {α | Y″α ⊆ u ∧ u − Y″α ≠ ∅}. Clearly, if α ∈ W and β ∈ α,
then β ∈ W. Hence, either W = On or W is some ordinal γ. (If W ≠ On, let γ be the
least ordinal in On − W.) If α ∈ W, then Y′α = X′(α↿Y) is the r-least element of
u − Y″α; so, Y′α ∈ u and, if β ∈ α, Y′α ≠ Y′β. Thus, Y is a one-one function on W
and the range of Y restricted to W is a subset of u. Now, let h = (W↿Y) and f = h̆;
that is, let f be the inverse of Y restricted to W. So, by the replacement axiom (R),
W is a set. Hence, W is some ordinal γ. Let g = γ↿Y. Then g is a one-one
function with domain γ and range a subset u₁ of u. We must show that u₁ = u and
that, if α and β are in γ and β <ₒ α, then ⟨g′β, g′α⟩ ∈ r. Assume α and β are in
γ and β <ₒ α. Then g″β ⊆ g″α and, since g′α ∈ u − g″α, g′α ∈ u − g″β. But g′β
is the r-least element of u − g″β. Hence, ⟨g′β, g′α⟩ ∈ r. It remains to prove that
u₁ = u. Now, u₁ = Y″γ. Assume u − u₁ ≠ ∅. Then γ ∈ W. But W = γ, which yields
a contradiction. Hence, u = u₁. That γ is unique follows from Corollary 4.18(a).


Exercise 


4.37 Show that the conclusion of Proposition 4.19 also holds when u = ∅ and
that the unique ordinal γ is, in that case, ∅.


Proposition 4.20 


Let R be a well-ordering of a proper class X such that, for each y ∈ X, the class
of all R-predecessors of y in X (i.e., the R-segment in X determined by y) is a set.
Then R is "similar" to E_On; that is, there is a (unique) one-one mapping H of
On onto X such that α ∈ β ⇔ ⟨H′α, H′β⟩ ∈ R.


Proof


Proceed as in the proof of Proposition 4.19. Here, however, W = On; also, one
proves that ℛ(Y) = X by using the hypothesis that every R-segment of X is a
set. (If X − ℛ(Y) ≠ ∅, then, if w is the R-least element of X − ℛ(Y), the proper
class On is the range of Y̆, while the domain of Y̆ is the R-segment of X
determined by w, contradicting the replacement axiom.)


Exercise 


4.38 Show that, if X is a proper class of ordinal numbers, then there
is a unique one-one mapping H of On onto X such that α ∈ β ⇔ H′α
∈ H′β.


4.3 Equinumerosity: Finite and Denumerable Sets 


We say that two classes X and Y are equinumerous if and only if there is a
one-one function F with domain X and range Y. We shall denote this by X ≅ Y.


Definitions


X ≅_F Y for Fnc₁(F) ∧ 𝒟(F) = X ∧ ℛ(F) = Y
X ≅ Y for (∃F)(X ≅_F Y)


Notice that ⊢ (∀x)(∀y)(x ≅ y ⇔ (∃z)(x ≅_z y)). Hence, a wf x ≅ y is predicative
(that is, is equivalent to a wf using only set quantifiers).
Clearly, if X ≅_F Y, then Y ≅_G X, where G = F̆. Also, if X ≅_F₁ Y and Y ≅_F₂ Z,
then X ≅_H Z, where H is the composition F₂ ∘ F₁. Hence, we have the following
result.


Proposition 4.21


a. ⊢ X ≅ X
b. ⊢ X ≅ Y ⇒ Y ≅ X
c. ⊢ X ≅ Y ∧ Y ≅ Z ⇒ X ≅ Z


Proposition 4.22


a. ⊢ (X ≅ Y ∧ Z ≅ W ∧ X ∩ Z = ∅ ∧ Y ∩ W = ∅) ⇒ X ∪ Z ≅ Y ∪ W
b. ⊢ (X ≅ Y ∧ Z ≅ W) ⇒ X × Z ≅ Y × W
c. ⊢ X × {y} ≅ X
d. ⊢ X × Y ≅ Y × X
e. ⊢ (X × Y) × Z ≅ X × (Y × Z)


Proof
a. Let X ≅_F Y and Z ≅_G W. Then X ∪ Z ≅_H Y ∪ W, where H = F ∪ G.

b. Let X ≅_F Y and Z ≅_G W. Let H = {⟨u, v⟩ | (∃x)(∃y)(x ∈ X ∧ y ∈ Z ∧ u = ⟨x, y⟩
∧ v = ⟨F′x, G′y⟩)}. Then X × Z ≅_H Y × W.

c. Let F = {⟨u, v⟩ | u ∈ X ∧ v = ⟨u, y⟩}. Then X ≅_F X × {y}.

d. Let F = {⟨u, v⟩ | (∃x)(∃y)(x ∈ X ∧ y ∈ Y ∧ u = ⟨x, y⟩ ∧ v = ⟨y, x⟩)}. Then
X × Y ≅_F Y × X.

e. Let F = {⟨u, v⟩ | (∃x)(∃y)(∃z)(x ∈ X ∧ y ∈ Y ∧ z ∈ Z ∧ u = ⟨⟨x, y⟩, z⟩ ∧ v =
⟨x, ⟨y, z⟩⟩)}. Then (X × Y) × Z ≅_F X × (Y × Z).
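
The function F used in part (e) can be written down explicitly for small finite
sets. This is only an illustrative sketch, assuming Python tuples as ordered
pairs; the particular sets X, Y, and Z are arbitrary choices.

```python
from itertools import product

X, Y, Z = {0, 1}, {'a', 'b', 'c'}, {'p', 'q'}

# F pairs each ((x, y), z) with (x, (y, z)), as in the proof of part (e).
F = {(((x, y), z), (x, (y, z))) for x in X for y in Y for z in Z}

domain = {u for (u, _) in F}
rng = {v for (_, v) in F}

# F is one-one with domain (X × Y) × Z and range X × (Y × Z).
print(len(F), len(domain), len(rng))
```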


Definition 


X^Y for {u | u: Y → X}
X^Y is the class of all sets that are functions from Y into X.


Exercises


Prove the following.
4.39 ⊢ (∀X)(∀Y)(∃X₁)(∃Y₁)(X ≅ X₁ ∧ Y ≅ Y₁ ∧ X₁ ∩ Y₁ = ∅)
4.40 ⊢ 𝒫(y) ≅ 2^y (Recall that 2 = {∅, 1} and 1 = {∅}.)
4.41 a. ⊢ ¬M(Y) ⇒ X^Y = ∅
b. ⊢ (∀x)(∀y) M(x^y)
4.42 a. ⊢ X^∅ = 1
b. ⊢ 1^y ≅ 1
c. ⊢ Y ≠ ∅ ⇒ ∅^Y = ∅
4.43 ⊢ X ≅ X^1
4.44 ⊢ X ≅ Y ∧ Z ≅ W ⇒ X^Z ≅ Y^W
4.45 ⊢ X ∩ Y = ∅ ⇒ Z^(X ∪ Y) ≅ Z^X × Z^Y
4.46 ⊢ (∀x)(∀y)(∀z)[(x^y)^z ≅ x^(y × z)]
4.47 ⊢ (X × Y)^Z ≅ X^Z × Y^Z
4.48 ⊢ (∀x)(∀R)(R We x ⇒ (∃α)(x ≅ α))


We can define a partial order ≼ on classes such that, intuitively, X ≼ Y if
and only if Y has at least as many elements as X.


Definitions


X ≼ Y for (∃Z)(Z ⊆ Y ∧ X ≅ Z)
(X is equinumerous with a subclass of Y)


X ≺ Y for X ≼ Y ∧ ¬(X ≅ Y)
(Y is strictly greater in size than X)


Exercises


Prove the following.

4.49 ⊢ X ≼ Y ⇔ (X ≺ Y ∨ X ≅ Y)

4.50 ⊢ X ≼ Y ∧ ¬M(X) ⇒ ¬M(Y)

4.51 ⊢ X ≼ Y ∧ (∃Z)(Z We Y) ⇒ (∃Z)(Z We X)

4.52 ⊢ (∀α)(∀β)(α ≼ β ∨ β ≼ α) [Hint: Proposition 4.8(k).]


Proposition 4.23


a. ⊢ X ≼ X ∧ ¬(X ≺ X)

b. ⊢ X ⊆ Y ⇒ X ≼ Y

c. ⊢ X ≼ Y ∧ Y ≼ Z ⇒ X ≼ Z

d. ⊢ X ≼ Y ∧ Y ≼ X ⇒ X ≅ Y (Bernstein's theorem)


Proof 


(a), (b) These proofs are obvious.
c. Assume X ≅_F Y₁ ∧ Y₁ ⊆ Y ∧ Y ≅_G Z₁ ∧ Z₁ ⊆ Z. Let H be the composition of
F and G. Then ℛ(H) ⊆ Z ∧ X ≅_H ℛ(H). So, X ≼ Z.
d. There are many proofs of this nontrivial theorem. The following one
was devised by Hellman (1961). First we derive a lemma.


Lemma


Assume X ∩ Y = ∅, X ∩ Z = ∅ and Y ∩ Z = ∅, and let X ≅_F X ∪ Y ∪ Z. Then there
is a G such that X ≅_G X ∪ Y.

Proof


Define a function H on a subclass of X × ω as follows: ⟨⟨u, k⟩, v⟩ ∈ H if and
only if u ∈ X and k ∈ ω and there is a function f with domain k′ such that
f′∅ = F′u and, if j ∈ k, then f′j ∈ X and f′(j′) = F′(f′j) and f′k = v. Thus, H′(⟨u, ∅⟩) =
F′u, H′(⟨u, 1⟩) = F′(F′u) if F′u ∈ X, and H′(⟨u, 2⟩) = F′(F′(F′u)) if F′u and F′(F′u)
are in X, and so on. Let X* be the class of all u in X such that (∃y)(y ∈ ω ∧
⟨u, y⟩ ∈ 𝒟(H) ∧ H′(⟨u, y⟩) ∈ Z). Let Y* be the class of all u in X such that (∀y)
(y ∈ ω ∧ ⟨u, y⟩ ∈ 𝒟(H) ⇒ H′(⟨u, y⟩) ∉ Z). Then X = X* ∪ Y*. Now define G
as follows: 𝒟(G) = X and, if u ∈ X*, then G′u = u, whereas, if u ∈ Y*, then
G′u = F′u. Then X ≅_G X ∪ Y. (This is left as an exercise.)

Now, to prove Bernstein's theorem, assume X ≅_F Y₁ ∧ Y₁ ⊆ Y ∧ Y ≅_G X₁ ∧ X₁ ⊆ X.
Let A = G″Y₁ ⊆ X₁ ⊆ X. But A ∩ (X₁ − A) = ∅, A ∩ (X − X₁) = ∅ and (X − X₁)
∩ (X₁ − A) = ∅. Also, X = (X − X₁) ∪ (X₁ − A) ∪ A, and the composition H of F
and G is a one-one function with domain X and range A. Hence, A ≅ X, by
way of the inverse H̆. So, by the lemma, there is a one-one function D such
that A ≅_D X₁ (since (X₁ − A) ∪ A = X₁). Let T be the composition of the
functions H, D and Ğ; that is, T′u = (Ğ)′(D′(H′u)). Then X ≅_T Y, since X ≅_H A
and A ≅_D X₁ and X₁ ≅ Y by way of Ğ.


Exercises 


4.53 Carry out the details of the following proof (due to J. Whitaker)
of Bernstein's theorem in the case where X and Y are sets. Let
X ≅_F Y₁ ∧ Y₁ ⊆ Y ∧ Y ≅_G X₁ ∧ X₁ ⊆ X. We wish to find a set Z ⊆ X such
that G, restricted to Y − F″Z, is a one-one function of Y − F″Z onto X
− Z. [If we have such a set Z, let H = (Z↿F) ∪ ((X − Z)↿Ğ); that is, H′x =
F′x for x ∈ Z, and H′x = Ğ′x for x ∈ X − Z. Then X ≅_H Y.] Let Z = {x | (∃u)
(u ⊆ X ∧ x ∈ u ∧ G″(Y − F″u) ⊆ X − u)}. Notice that this proof does not
presuppose the definition of ω nor any other part of the theory of
ordinals.


4.54 Prove: (a) ⊢ X ≼ X ∪ Y (b) ⊢ X ≺ Y ⇒ ¬(Y ≺ X) (c) ⊢ X ≺ Y ∧ Y ≼ Z ⇒
X ≺ Z


Proposition 4.24 


Assume X < Y and A X B. Then: 


a. YNB=@>XVAKXYUB 
b XxAXYxB 
c. X4< Y®if Bisaset and -(X =A=Y=@AB#Z@) 


Proof 


a. Assume X =v <Y and A=B, cB. Let H be a function with domain 


X UA such that H’x = F’x for x € X, and H’x = G’x for x € A —X. Then 
XUAZH"(XUA)CY UB. 


b. and (c) are left as exercises. 
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Proposition 4.25 


a. F A(af\(Enc(f) A 9 (f) =x A A(f) =.7(x)). (There is no function from x 
onto .7(x).) 
b. x < A(x) (Cantor’s theorem) 


Proof 


a. Assume Fnc(f) ∧ 𝒟(f) = x ∧ ℛ(f) = 𝒫(x). Let y = {u | u ∈ x ∧ u ∉ f’u}.
Then y ∈ 𝒫(x). Hence, there is some z in x such that f’z = y. But, (∀u)(u
∈ y ⇔ u ∈ x ∧ u ∉ f’u). Hence, (∀u)(u ∈ f’z ⇔ u ∈ x ∧ u ∉ f’u). By rule A4,
z ∈ f’z ⇔ z ∈ x ∧ z ∉ f’z. Since z ∈ x, we obtain z ∈ f’z ⇔ z ∉ f’z, which
yields a contradiction.

b. Let f be the function with domain x such that f’u = {u} for each u in x.
Then f”x ⊆ 𝒫(x) and f is one-one. Hence, x ≼ 𝒫(x). By part (a), x ≅ 𝒫(x)
is impossible. Hence, x ≺ 𝒫(x).
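The diagonal argument in part (a) can be checked by brute force on a small finite set. The following Python sketch is only an illustration under stated assumptions: the names `powerset`, `diagonal_set` and `no_surjection` are hypothetical helpers, and Python frozensets stand in for sets of the theory.

```python
from itertools import combinations, product

def powerset(xs):
    """All subsets of xs, as frozensets."""
    xs = list(xs)
    return [frozenset(c) for r in range(len(xs) + 1) for c in combinations(xs, r)]

def diagonal_set(x, f):
    """The set y = {u in x : u not in f(u)} used in the proof of part (a)."""
    return frozenset(u for u in x if u not in f[u])

def no_surjection(x):
    """Check every function f from x into its power set: the diagonal
    set of f must never be a value of f."""
    x = list(x)
    subsets = powerset(x)
    for values in product(subsets, repeat=len(x)):
        f = dict(zip(x, values))
        if diagonal_set(x, f) in f.values():
            return False
    return True
```

For a three-element set this examines all 8^3 = 512 candidate functions; the check is exhaustive only for the finite set supplied and does not replace the proof.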


In naive set theory, Proposition 4.25(b) gives rise to Cantor’s paradox. If we
let x = V, then V ≺ 𝒫(V). But 𝒫(V) ⊆ V and, therefore, 𝒫(V) ≼ V. From V ≺ 𝒫(V),
we have V ≼ 𝒫(V). By Bernstein’s theorem, V ≅ 𝒫(V), contradicting V ≺ 𝒫(V).
In NBG, this argument is just another proof that V is not a set.

Notice that we have not proved ⊢ (∀x)(∀y)(x ≼ y ∨ y ≼ x). This intuitively
plausible statement is, in fact, not provable, since it turns out to be equivalent
to the axiom of choice (which will be discussed in Section 4.5).

The equinumerosity relation ≅ has all the properties of an equivalence
relation. We are inclined, therefore, to partition the class of all sets into
equivalence classes under this relation. The equivalence class of a set x would
be the class of all sets equinumerous with x. The equivalence classes are
called Frege-Russell cardinal numbers. For example, if u is a set and x = {u}, then
the equivalence class of x is the class of all singletons {v} and is referred to
as the cardinal number 1_c. Likewise, if u ≠ v and y = {u, v}, then the
equivalence class of y is the class of all sets that contain exactly two elements and
would be the cardinal number 2_c; that is, 2_c is {x | (∃w)(∃z)(w ≠ z ∧ x = {w, z})}.
All the Frege-Russell cardinal numbers, except the cardinal number
0_c of ∅ (which is {∅}), turn out to be proper classes. For example, V ≅ 1_c.
(Let F’x = {x} for all x. Then V ≅_F 1_c.) But, ¬M(V). Hence, by the replacement
axiom, ¬M(1_c).


Exercise 


4.55 Prove ⊢ ¬M(2_c).


Because all the Frege-Russell cardinal numbers (except 0_c) are proper
classes, we cannot talk about classes of such cardinal numbers, and it is
difficult or impossible to say and prove many interesting things about them.
Most assertions one would like to make about cardinal numbers can be
paraphrased by the suitable use of ≅, ≼, and ≺. However, we shall see later
that, given certain additional plausible axioms, there are other ways of
defining a notion that does essentially the same job as the Frege-Russell cardinal
numbers.

To see how everything we want to say about cardinal numbers can be said 
without explicit mention of cardinal numbers, consider the following treat- 
ment of the “sum” of cardinal numbers. 


Definition 


X +_c Y for (X × {∅}) ∪ (Y × {1})

Note that ⊢ ∅ ≠ 1 (since 1 is {∅}). Hence, X × {∅} and Y × {1} are disjoint
and, therefore, their union is a class whose “size” is the sum of the “sizes” of
X and Y.
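As a finite illustration of this definition, the sketch below tags the elements of X with 0 and those of Y with 1 before taking the union. It is an assumption-laden model: the function name `cardinal_sum` is hypothetical, and the Python integers 0 and 1 merely play the roles of ∅ and {∅}.

```python
def cardinal_sum(X, Y):
    """Model of X +_c Y as (X x {0}) u (Y x {1}).

    The tags 0 and 1 stand in for the sets ∅ and 1 = {∅}; because the
    tags differ, the two pieces are disjoint even when X and Y overlap.
    """
    return {(x, 0) for x in X} | {(y, 1) for y in Y}
```

For example, with X = {1, 2} and Y = {2, 3} the ordinary union has three elements, whereas `cardinal_sum(X, Y)` has four, matching the sum of the sizes.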


Exercise 


4.56 Prove:

a. ⊢ X ≼ X +_c Y ∧ Y ≼ X +_c Y
b. ⊢ X ≅ A ∧ Y ≅ B ⇒ X +_c Y ≅ A +_c B
c. ⊢ X +_c Y ≅ Y +_c X
d. ⊢ M(X +_c Y) ⇔ M(X) ∧ M(Y)
e. ⊢ X +_c (Y +_c Z) ≅ (X +_c Y) +_c Z
f. ⊢ X ≼ Y ⇒ X +_c Z ≼ Y +_c Z
g. ⊢ X +_c X = X × 2 (Recall that 2 is {∅, 1}.)
h. ⊢ X^(Y +_c Z) ≅ X^Y × X^Z
i. ⊢ x ≅ x +_c 1 ⇒ 2^x +_c 2^x ≅ 2^x


4.3.1 Finite Sets 


Remember that ω is the set of all ordinals α such that α and all smaller ordinals are
successor ordinals or ∅. The elements of ω are called finite ordinals, and the
elements of On − ω are called infinite ordinals. From an intuitive standpoint,
ω consists of ∅, 1, 2, 3, ..., where each term in this sequence after ∅ is the
successor of the preceding term. Note that ∅ contains no members, 1 = {∅}
and contains one member, 2 = {∅, 1} and contains two members, 3 = {∅, 1, 2}
and contains three members, etc. Thus, it is reasonable to think that, for each
intuitive finite number n, there is exactly one finite ordinal that contains
exactly n members. So, if a class has n members, it should be equinumerous
with a finite ordinal. Therefore, a class will be called finite if and only if it is
equinumerous with a finite ordinal.


Definition
Fin(X) for (∃α)(α ∈ ω ∧ X ≅ α) (X is finite)
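The intuitive picture of the finite ordinals can be made concrete with a small Python sketch. The helper names `successor` and `finite_ordinal` are hypothetical, and nested frozensets are used as a stand-in for the sets ∅, 1, 2, 3, ...; only small values of n are practical.

```python
def successor(a):
    """The successor a' = a u {a} of a set a."""
    return a | frozenset([a])

def finite_ordinal(n):
    """Build the n-th finite ordinal by applying the successor n times to ∅."""
    a = frozenset()
    for _ in range(n):
        a = successor(a)
    return a
```

In this model, `finite_ordinal(3)` is the set whose members are `finite_ordinal(0)`, `finite_ordinal(1)` and `finite_ordinal(2)`, so it has exactly three members, as the paragraph above describes.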


Exercise 


4.57 Prove:
a. ⊢ Fin(X) ⇒ M(X) (Every finite class is a set)
b. ⊢ (∀α)(α ∈ ω ⇒ Fin(α)) (Every finite ordinal is finite.)
c. ⊢ Fin(X) ∧ X ≅ Y ⇒ Fin(Y)


Proposition 4.26 


a. ⊢ (∀α)(α ∉ ω ⇒ α ≅ α’).

b. ⊢ (∀α)(∀β)(α ∈ ω ∧ α ≠ β ⇒ ¬(α ≅ β)). (No finite ordinal is
equinumerous with any other ordinal.)

c. ⊢ (∀α)(∀x)(α ∈ ω ∧ x ⊂ α ⇒ ¬(α ≅ x)). (No finite ordinal is
equinumerous with a proper subset of itself.)


Proof 


a. Assume α ∉ ω. Define a function f with domain α’ as follows: f’δ = δ’
if δ ∈ ω; f’δ = δ if δ ∈ α’ ∧ δ ∉ ω ∪ {α}; and f’α = ∅. Then α’ ≅_f α.

b. Assume this is false, and let α be the least ordinal such that α ∈ ω
and there is β ≠ α such that α ≅ β. Hence, α <_o β. (Otherwise, β would
be a smaller ordinal than α and β would also be in ω, and β would
be equinumerous with another ordinal, namely, α.) Let α ≅_f β. If α = ∅,
then f = ∅ and β = ∅, contradicting α ≠ β. So, α ≠ ∅. Since α ∈ ω, α = δ’
for some δ ∈ ω. We may assume that β = γ’ for some γ. (If β ∈ ω, then
β ≠ ∅; and if β ∉ ω, then, by part (a), β ≅ β’ and we can take β’ instead
of β.) Thus, δ’ ≅_f γ’. Also, δ ≠ γ, since α ≠ β.

Case 1. f’δ = γ. Then δ ≅_g γ, where g = δ↿f.

Case 2. f’δ ≠ γ. Then there is some μ ∈ δ such that f’μ = γ. Let h = ((δ↿f) −
{⟨μ, γ⟩}) ∪ {⟨μ, f’δ⟩}; that is, let h’τ = f’τ if τ ∉ {δ, μ}, and h’μ = f’δ.
Then δ ≅_h γ.

In both cases, δ is a finite ordinal smaller than α that is equinumerous
with a different ordinal γ, contradicting the minimality of α.

c. Assume β ∈ ω ∧ x ⊂ β ∧ β ≅ x holds for some β, and let α be the least
such β. Clearly, α ≠ ∅; hence, α = γ’ for some γ. But, as in the proof
of part (b), one can then show that γ is also equinumerous with a
proper subset of itself, contradicting the minimality of α.


Exercises


4.58 Prove: ⊢ (∀α)(Fin(α) ⇔ α ∈ ω).


4.59 Prove that the axiom of infinity (I) is equivalent to the following
sentence.


(*) (∃x)((∃u)(u ∈ x) ∧ (∀y)(y ∈ x ⇒ (∃z)(z ∈ x ∧ y ⊂ z)))


Proposition 4.27 


a. ⊢ Fin(X) ∧ Y ⊆ X ⇒ Fin(Y)
b. ⊢ Fin(X) ⇒ Fin(X ∪ {y})
c. ⊢ Fin(X) ∧ Fin(Y) ⇒ Fin(X ∪ Y)


Proof 


a. Assume Fin(X) ∧ Y ⊆ X. Then X ≅_f α, where α ∈ ω. Let g = Y↿f and
W = g”Y ⊆ α. W is a set of ordinals, and so, E_W is a well-ordering of
W. By Proposition 4.19, ⟨E_W, W⟩ is similar to ⟨E_β, β⟩ for some ordinal
β. Hence, W ≅ β. In addition, β ≤_o α. (If α <_o β, then the similarity of
⟨E_β, β⟩ to ⟨E_W, W⟩ contradicts Corollary 4.17.) Since α ∈ ω, β ∈ ω. From
Y ≅_g W ∧ W ≅ β, it follows that Fin(Y).

b. If y ∈ X, then X ∪ {y} = X and the result is trivial. So, assume y ∉ X.
From Fin(X) it follows that there is a finite ordinal α and a
function f such that α ≅_f X. Let g = f ∪ {⟨α, y⟩}. Then α’ ≅_g X ∪ {y}. Hence,
Fin(X ∪ {y}).

c. Let Z = {u | u ∈ ω ∧ (∀x)(∀y)(x ≅ u ∧ Fin(y) ⇒ Fin(x ∪ y))}. We must
show that Z = ω. Clearly, ∅ ∈ Z, for if x ≅ ∅, then x = ∅ and x ∪ y = y.
Assume that α ∈ Z. Let x ≅_f α’ and Fin(y). Let w be such that f’w = α
and let x_1 = x − {w}. Then x_1 ≅ α. Since α ∈ Z, Fin(x_1 ∪ y). But x ∪ y =
(x_1 ∪ y) ∪ {w}. Hence, by part (b), Fin(x ∪ y). Thus, α’ ∈ Z. Hence, by
Proposition 4.11(c), Z = ω.


Definitions 


DedFin(X) for M(X) ∧ (∀Y)(Y ⊂ X ⇒ ¬(X ≅ Y))
(X is Dedekind-finite, that is, X is a set that is not equinumerous with any
proper subset of itself)

DedInf(X) for M(X) ∧ ¬DedFin(X)
(X is Dedekind-infinite, that is, X is a set that is equinumerous with a proper
subset of itself)


Corollary 4.28
⊢ (∀x)(Fin(x) ⇒ DedFin(x)) (Every finite set is Dedekind-finite)*


Proof 


This follows easily from Proposition 4.26(c) and the definition of “finite.” 


Definitions 

Inf(X) for ¬Fin(X) (X is infinite)
Den(X) for X ≅ ω (X is denumerable)
Count(X) for Fin(X) ∨ Den(X) (X is countable)


Exercise

4.60 Prove:

a. ⊢ Inf(X) ∧ X ≅ Y ⇒ Inf(Y)
b. ⊢ Den(X) ∧ X ≅ Y ⇒ Den(Y)
c. ⊢ Den(X) ⇒ M(X)
d. ⊢ Count(X) ∧ X ≅ Y ⇒ Count(Y)
e. ⊢ Count(X) ⇒ M(X)


Proposition 4.29 


a. ⊢ Inf(X) ∧ X ⊆ Y ⇒ Inf(Y)
b. ⊢ Inf(X) ⇔ Inf(X ∪ {y})
c. ⊢ DedInf(X) ⇒ Inf(X)
d. ⊢ Inf(ω)


Proof


a. This follows from Proposition 4.27(a).

b. ⊢ Inf(X) ⇒ Inf(X ∪ {y}) by part (a), and ⊢ Inf(X ∪ {y}) ⇒ Inf(X) by
Proposition 4.27(b).

c. Use Corollary 4.28.

d. ⊢ ω ∉ ω. If Fin(ω), then ω ≅ α for some α in ω, contradicting Proposition
4.26(b).


* The converse is not provable without additional assumptions, such as the axiom of choice.


Proposition 4.30


⊢ (∀v)(∀z)(Den(v) ∧ z ⊆ v ⇒ Count(z)). (Every subset of a denumerable set is
countable.)


Proof 


It suffices to prove that z ⊆ ω ⇒ Fin(z) ∨ Den(z). Assume z ⊆ ω ∧ ¬Fin(z). Since
¬Fin(z), for any α in z, there is some β in z with α <_o β. (Otherwise, z ⊆ α’ and,
since Fin(α’), Fin(z).) Let X be a function such that, for any α in ω, X’α is the
least ordinal β in z with α <_o β. Then, by Proposition 4.14(c) (with δ = ω), there
is a function Y with domain ω such that Y’∅ is the least ordinal in z and, for
any γ in ω, Y’(γ’) is the least ordinal β in z with β >_o Y’γ. Clearly, Y is one-one,
𝒟(Y) = ω, and Y”ω ⊆ z. To show that Den(z), it suffices to show that Y”ω = z.
Assume z − Y”ω ≠ ∅. Let δ be the least ordinal in z − Y”ω, and let τ be the least
ordinal in Y”ω with τ >_o δ. Then τ = Y’σ for some σ in ω. Since δ <_o τ, σ ≠ ∅.
So, σ = μ’ for some μ in ω. Then τ = Y’σ is the least ordinal in z that is greater
than Y’μ. But δ >_o Y’μ, since τ is the least ordinal in Y”ω that is greater than δ.
Hence, τ ≤_o δ, which contradicts δ <_o τ.
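The function Y built in this proof enumerates z in increasing order: first the least element of z, then repeatedly the least element of z above the previous value. A minimal Python sketch of that recursion follows, assuming z is given by a membership test on the natural numbers; the name `enumerate_subset` and the parameter `in_z` are hypothetical.

```python
from itertools import count, islice

def enumerate_subset(in_z):
    """Yield Y'0, Y'1, Y'2, ... for the subset z = {n : in_z(n)} of the naturals.

    Y'0 is the least element of z, and each later value is the least element
    of z greater than the previous one. The predicate is assumed to describe
    an infinite set; for a finite set the search for the next element would
    never terminate.
    """
    prev = None
    while True:
        start = 0 if prev is None else prev + 1
        prev = next(n for n in count(start) if in_z(n))
        yield prev
```

For instance, `list(islice(enumerate_subset(lambda n: n % 3 == 1), 5))` lists the first five members of the set of numbers congruent to 1 modulo 3.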


Exercises 


4.61 Prove: ⊢ Count(X) ∧ Y ⊆ X ⇒ Count(Y).
4.62 Prove:
a. ⊢ Fin(X) ⇒ Fin(𝒫(X))
b. ⊢ Fin(X) ∧ (∀y)(y ∈ X ⇒ Fin(y)) ⇒ Fin(⋃X)
c. ⊢ X ≼ Y ∧ Fin(Y) ⇒ Fin(X)
d. ⊢ Fin(𝒫(X)) ⇒ Fin(X)
e. ⊢ Fin(⋃X) ⇒ Fin(X) ∧ (∀y)(y ∈ X ⇒ Fin(y))
f. ⊢ Fin(X) ⇒ (X ≼ Y ∨ Y ≼ X)
g. ⊢ Fin(X) ∧ Inf(Y) ⇒ X ≺ Y
h. ⊢ Fin(X) ∧ Y ⊂ X ⇒ Y ≺ X
i. ⊢ Fin(X) ∧ Fin(Y) ⇒ Fin(X × Y)
j. ⊢ Fin(X) ∧ Fin(Y) ⇒ Fin(X^Y)
k. ⊢ Fin(X) ∧ y ∉ X ⇒ X ≺ X ∪ {y}


4.63 Define X to be a minimal (respectively, maximal) element of Y if and only
if X ∈ Y and (∀y)(y ∈ Y ⇒ ¬(y ⊂ X)) (respectively, (∀y)(y ∈ Y ⇒ ¬(X ⊂ y))).
Prove that a set Z is finite if and only if every nonempty set of subsets
of Z has a minimal (respectively, maximal) element (Tarski, 1925).


4.64 Prove:
a. ⊢ Fin(X) ∧ Den(Y) ⇒ Den(X ∪ Y)
b. ⊢ Fin(X) ∧ Den(Y) ∧ X ≠ ∅ ⇒ Den(X × Y)
c. ⊢ (∀x)[DedInf(x) ⇔ (∃y)(y ⊆ x ∧ Den(y))]. (A set is Dedekind-infinite
if and only if it has a denumerable subset)
d. ⊢ (∀x)[(∃y)(y ⊆ x ∧ Den(y)) ⇔ ω ≼ x]
e. ⊢ (∀α)(α ∉ ω ⇒ DedInf(α)) ∧ (∀α)(Inf(α) ⇒ α ∉ ω)
f. ⊢ (∀x)(∀y)(y ∉ x ⇒ [DedInf(x) ⇔ x ≅ x ∪ {y}])
g. ⊢ (∀x)(ω ≼ x ⇔ x +_c 1 ≅ x)


4.65 If NBG is consistent, then, by Proposition 2.17, NBG has a
denumerable model. Explain why this does not contradict Cantor’s theorem,
which implies that there exist nondenumerable infinite sets (such as
𝒫(ω)). This apparent, but not genuine, contradiction is sometimes called
Skolem’s paradox.


4.4 Hartogs’ Theorem: Initial Ordinals—Ordinal Arithmetic 


An unjustly neglected proposition with many uses in set theory is Hartogs’ 
theorem. 


Proposition 4.31 (Hartogs, 1915) 


⊢ (∀x)(∃α)(∀y)(y ⊆ x ⇒ ¬(α ≅ y)). (For any set x, there is an ordinal that is not
equinumerous with any subset of x.)


Proof 


Assume that every ordinal α is equinumerous with some subset y of x.
Hence, y ≅_f α for some f. Define a relation r on y by stipulating that ⟨u, v⟩
∈ r if and only if f’u ∈ f’v. Then r is a well-ordering of y such that ⟨r, y⟩
is similar to ⟨E_α, α⟩. Now define a function F with domain On such that,
for any α, F’α is the set w of all pairs ⟨z, y⟩ such that y ⊆ x, z is a
well-ordering of y, and ⟨E_α, α⟩ is similar to ⟨z, y⟩. (w is a set, since w ⊆ 𝒫(x × x) ×
𝒫(x).) Since F”(On) ⊆ 𝒫(𝒫(x × x) × 𝒫(x)), F”(On) is a set. F is one-one;
hence, On = F̆”(F”(On)) is a set by the replacement axiom, contradicting
Proposition 4.8(h).


Definition
Let ℋ denote the function with domain V such that, for every x, ℋ’x is the
least ordinal α that is not equinumerous with any subset of x. (ℋ is called
Hartogs’ function.)


Corollary 4.32 


⊢ (∀x)(ℋ’x ≼ 𝒫𝒫𝒫𝒫(x))


Proof


With each β <_o ℋ’x, associate the set of relations r such that r ⊆ x × x, r is a
well-ordering of its field y, and ⟨r, y⟩ is similar to ⟨E_β, β⟩. This defines a
one-one function from ℋ’x into 𝒫𝒫(x × x). Hence, ℋ’x ≼ 𝒫𝒫(x × x). By Exercise
4.12(s), x × x ⊆ 𝒫𝒫(x). So, 𝒫𝒫(x × x) ⊆ 𝒫𝒫𝒫𝒫(x), and therefore, ℋ’x ≼ 𝒫𝒫𝒫𝒫(x).


Definition 


Init(X) for X ∈ On ∧ (∀β)(β <_o X ⇒ ¬(β ≅ X))
(X is an initial ordinal) 


An initial ordinal is an ordinal that is not equinumerous with any smaller 
ordinal. 


Exercises 


4.66 a. ⊢ (∀α)(α ∈ ω ⇒ Init(α)). (Every finite ordinal is an initial ordinal.)
b. ⊢ Init(ω).
[Hint: Use Proposition 4.26(b) for both parts.]
4.67 Prove:
a. For every x, ℋ’x is an initial ordinal.
b. For any ordinal α, ℋ’α is the least initial ordinal greater than α.
c. For any set x, ℋ’x = ω if and only if x is infinite and x is
Dedekind-finite. [Hint: Exercise 4.64(c).]


Definition by transfinite induction (Proposition 4.14(b)) yields a function G
with domain On such that
G’∅ = ω
G’(α’) = ℋ’(G’α) for every α
G’λ = ⋃(G”(λ)) for every limit ordinal λ


Proposition 4.33


a. ⊢ (∀α)(Init(G’α) ∧ ω ≤_o G’α ∧ (∀β)(β <_o α ⇒ G’β <_o G’α))
b. ⊢ (∀α)(α ≤_o G’α)
c. ⊢ (∀β)(ω ≤_o β ∧ Init(β) ⇒ (∃α)(G’α = β))


Proof 


a. Let X = {α | Init(G’α) ∧ ω ≤_o G’α ∧ (∀β)(β <_o α ⇒ G’β <_o G’α)}.

We must show that On ⊆ X. To do this, we use the second form of
transfinite induction (Proposition 4.13(a)). First, ∅ ∈ X, since G’∅ = ω.
Second, assume α ∈ X. We must show that α’ ∈ X. Since α ∈ X, G’α
is an infinite initial ordinal such that (∀β)(β <_o α ⇒ G’β <_o G’α). By
definition, G’(α’) = ℋ’(G’α), the least initial ordinal >_o G’α. Assume
β <_o α’. Then β <_o α ∨ β = α. If β <_o α, then, since α ∈ X, G’β <_o G’α <_o
G’(α’). If β = α, then G’β = G’α <_o G’(α’). In either case, G’β <_o G’(α’).
Hence, α’ ∈ X. Finally, assume Lim(α) ∧ (∀β)(β <_o α ⇒ β ∈ X). We
must show that α ∈ X. By definition, G’α = ⋃(G”(α)). Now consider
any β <_o α. Since Lim(α), β’ <_o α. By assumption, β’ ∈ X, that is, G’(β’)
is an infinite initial ordinal such that, for any γ <_o β’, G’γ <_o G’(β’). It
follows that G”(α) is a nonempty set of ordinals without a maximum
and, therefore, by Proposition 4.12, G’α, which is ⋃(G”(α)), is a limit
ordinal that is the least upper bound of G”(α). To conclude that α ∈ X,
we must show that G’α is an initial ordinal. For the sake of
contradiction, assume that there exists δ such that δ <_o G’α and δ ≅ G’α. Since
G’α is the least upper bound of G”(α), there must exist some μ in G”(α)
such that δ <_o μ. Say, μ = G’β with β <_o α. So, δ ⊂ μ = G’β ⊂ G’(β’) ⊆
G’α ≅ δ. Since δ ⊂ G’(β’), δ ∈ G’(β’) and δ ≼ G’(β’). On the other hand,
since G’(β’) ⊆ G’α ≅ δ, G’(β’) ≼ δ. By Bernstein’s theorem, δ ≅ G’(β’),
contradicting the fact that G’(β’) is an initial ordinal.

b. This follows from Corollary 4.16 and part (a).

c. Assume, for the sake of contradiction, that there is an infinite initial
ordinal that is not in the range of G, and let σ be the least such.
By part (b), σ ≤_o G’σ and, by part (a), G’σ is an initial ordinal. Since
σ is not in the range of G, σ <_o G’σ. Let μ be the least ordinal such that
σ <_o G’μ. Clearly, μ ≠ ∅, since G’∅ = ω <_o σ. Assume first that μ is a
successor ordinal γ’. Then, by the minimality of μ, G’γ <_o σ. Since G’(γ’) =
ℋ’(G’γ), G’(γ’) is the least initial ordinal greater than G’γ. However,
this contradicts the fact that σ is an initial ordinal greater than G’γ
and σ <_o G’(γ’). So, μ must be a limit ordinal. Since G’μ = ⋃(G”(μ)),
the least upper bound of G”(μ), and σ <_o G’μ, there is some δ <_o μ such
that σ <_o G’δ <_o G’μ, contradicting the minimality of μ.


Thus, by Proposition 4.33, G is a one-one <_o-preserving function from On
onto the class of all infinite initial ordinals.


Notation


ω_α for G’α

Hence, (a) ω_∅ = ω; (b) ω_α’ is the least initial ordinal greater than ω_α; (c) for a
limit ordinal λ, ω_λ is the initial ordinal that is the least upper bound of the
set of all ω_γ with γ <_o λ. Moreover, ω_α ≥_o α for all α. In addition, any infinite
ordinal α is equinumerous with a unique initial ordinal ω_β ≤_o α, namely, with
the least ordinal equinumerous with α.

Let us return now to ordinal arithmetic. We already have defined ordi- 
nal addition, multiplication and exponentiation (see Examples 1-2 on 
pages 256-257 and Exercise 4.36). 


Proposition 4.34 


The following wfs are theorems. 

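a. β +_o 1 = β’
b. ∅ +_o β = β
c. ∅ <_o β ⇒ (α <_o α +_o β ∧ β ≤_o α +_o β)
d. β <_o γ ⇒ α +_o β <_o α +_o γ
e. α +_o β = α +_o δ ⇒ β = δ
f. α <_o β ⇒ (∃_1 δ)(α +_o δ = β)
g. ∅ ≠ x ⊆ On ⇒ α +_o ⋃_{β∈x} β = ⋃_{β∈x}(α +_o β)
h. ∅ <_o α ∧ 1 <_o β ⇒ α <_o α ×_o β
i. ∅ <_o α ∧ ∅ <_o β ⇒ α ≤_o α ×_o β
j. γ <_o β ∧ ∅ <_o α ⇒ α ×_o γ <_o α ×_o β
k. x ⊆ On ⇒ α ×_o ⋃_{β∈x} β = ⋃_{β∈x}(α ×_o β)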


Proof 


a. β +_o 1 = β +_o (∅’) = (β +_o ∅)’ = β’.

b. Prove ∅ +_o β = β by transfinite induction (Proposition 4.13(a)). Let
X = {β | ∅ +_o β = β}. First, ∅ ∈ X, since ∅ +_o ∅ = ∅. If ∅ +_o γ = γ, then
∅ +_o γ’ = (∅ +_o γ)’ = γ’. If Lim(α) and ∅ +_o τ = τ for all τ <_o α, then
∅ +_o α = ⋃_{τ <_o α}(∅ +_o τ) = ⋃_{τ <_o α} τ = α, since ⋃_{τ <_o α} τ is the least upper
bound of the set of all τ <_o α, which is α.

c. Let X = {β | ∅ <_o β ⇒ α <_o α +_o β}. Prove X = On by transfinite
induction. Clearly, ∅ ∈ X. If γ ∈ X, then α ≤_o α +_o γ; hence α ≤_o
α +_o γ <_o (α +_o γ)’ = α +_o γ’. If Lim(λ) and τ ∈ X for all τ <_o λ, then
α <_o α’ = α +_o 1 ≤_o ⋃_{τ <_o λ}(α +_o τ) = α +_o λ. The second part is left as an
exercise.

d. Let X = {γ | (∀α)(∀β)(β <_o γ ⇒ α +_o β <_o α +_o γ)} and use transfinite
induction. Clearly, ∅ ∈ X. Assume γ ∈ X and β <_o γ’. Then β <_o γ or β = γ. If β
<_o γ then, since γ ∈ X, α +_o β <_o α +_o γ <_o (α +_o γ)’ = α +_o γ’. If β = γ, then
α +_o β = α +_o γ <_o (α +_o γ)’ = α +_o γ’. Hence, γ’ ∈ X. Assume Lim(λ) and
τ ∈ X for all τ <_o λ. Assume β <_o λ. Then β <_o τ for some τ <_o λ, since
Lim(λ). Hence, since τ ∈ X, α +_o β <_o α +_o τ ≤_o ⋃_{τ <_o λ}(α +_o τ) = α +_o λ.
Hence, λ ∈ X.

e. Assume α +_o β = α +_o δ. Now, either β <_o δ or δ <_o β or δ = β. If β <_o δ, then
α +_o β <_o α +_o δ by part (d), and, if δ <_o β, then α +_o δ <_o α +_o β by part (d);
in either case, we get a contradiction with α +_o β = α +_o δ. Hence, δ = β.

f. The uniqueness follows from part (e). Prove the existence by
induction on β. Let X = {β | α <_o β ⇒ (∃_1 δ)(α +_o δ = β)}. Clearly, ∅ ∈ X.
Assume γ ∈ X and α <_o γ’. Hence, α = γ or α <_o γ. If α = γ, then (∃δ)
(α +_o δ = γ’), namely, δ = 1. If α <_o γ, then, since γ ∈ X, (∃_1 δ)(α +_o δ = γ).
Take an ordinal σ such that α +_o σ = γ. Then α +_o σ’ = (α +_o σ)’ = γ’;
thus, (∃δ)(α +_o δ = γ’); hence, γ’ ∈ X. Assume now that Lim(λ) and
τ ∈ X for all τ <_o λ. Assume α <_o λ. Now define a function f such
that, for α <_o μ <_o λ, f’μ is the unique ordinal δ such that α +_o δ = μ.
But λ = ⋃_{α <_o μ <_o λ} μ = ⋃_{α <_o μ <_o λ}(α +_o f’μ). Let ρ = ⋃_{α <_o μ <_o λ}(f’μ). Notice
that, if α <_o μ <_o λ, then f’μ <_o f’(μ’); hence, ρ is a limit ordinal. Then
λ = ⋃_{α <_o μ <_o λ}(α +_o f’μ) = ⋃_{σ <_o ρ}(α +_o σ) = α +_o ρ.

g. Assume ∅ ≠ x ⊆ On. By part (f), there is some δ such that
α +_o δ = ⋃_{β∈x}(α +_o β). We must show that δ = ⋃_{β∈x} β. If β ∈ x, then α +_o
β ≤_o α +_o δ. Hence, β ≤_o δ by part (d). Therefore, δ is an upper bound
of the set of all β in x. So, ⋃_{β∈x} β ≤_o δ. On the other hand, if β ∈ x, then
α +_o β ≤_o α +_o ⋃_{β∈x} β. Hence,
α +_o δ = ⋃_{β∈x}(α +_o β) ≤_o α +_o ⋃_{β∈x} β and so, by part (d), δ ≤_o ⋃_{β∈x} β.
Therefore, δ = ⋃_{β∈x} β.


(h)-(k) are left as exercises.


Proposition 4.35


The following wfs are theorems.


a. β ×_o 1 = β ∧ 1 ×_o β = β
b. ∅ ×_o β = ∅
c. (α +_o β) +_o γ = α +_o (β +_o γ)
d. (α ×_o β) ×_o γ = α ×_o (β ×_o γ)
e. α ×_o (β +_o γ) = (α ×_o β) +_o (α ×_o γ)
f. exp(β, 1) = β ∧ exp(1, β) = 1
g. exp(exp(β, γ), δ) = exp(β, γ ×_o δ)
h. exp(β, γ +_o δ) = exp(β, γ) ×_o exp(β, δ)*
i. α >_o 1 ∧ β <_o γ ⇒ exp(α, β) <_o exp(α, γ)
* In traditional notation, the results of (f)-(h) would be written as β^1 = β, 1^β = 1,
(β^γ)^δ = β^(γ ×_o δ), β^(γ +_o δ) = β^γ ×_o β^δ.


Proof

a. β ×_o 1 = β ×_o ∅’ = (β ×_o ∅) +_o β = ∅ +_o β = β, by Proposition 4.34(b).
Prove 1 ×_o β = β by transfinite induction.

b. Prove ∅ ×_o β = ∅ by transfinite induction.

c. Let X = {γ | (∀α)(∀β)((α +_o β) +_o γ = α +_o (β +_o γ))}. ∅ ∈ X, since (α +_o
β) +_o ∅ = α +_o β = α +_o (β +_o ∅). Now assume γ ∈ X. Then (α +_o β) +_o
γ’ = ((α +_o β) +_o γ)’ = (α +_o (β +_o γ))’ = α +_o (β +_o γ)’ = α +_o (β +_o γ’).
Hence, γ’ ∈ X. Assume now that Lim(λ) and τ ∈ X for all τ <_o λ. Then
(α +_o β) +_o λ = ⋃_{τ <_o λ}((α +_o β) +_o τ) = ⋃_{τ <_o λ}(α +_o (β +_o τ)) = α +_o ⋃_{τ <_o λ}(β +_o τ),
by Proposition 4.34(g), and this is equal to α +_o (β +_o λ).

(d)-(i) are left as exercises.


We would like to consider for a moment the properties of ordinal addition,
multiplication and exponentiation when restricted to ω.


Proposition 4.36 


Assume α, β, γ are in ω. Then:


a. α +_o β ∈ ω
b. α ×_o β ∈ ω
c. exp(α, β) ∈ ω
d. α +_o β = β +_o α
e. α ×_o β = β ×_o α
f. (α +_o β) ×_o γ = (α ×_o γ) +_o (β ×_o γ)
g. exp(α ×_o β, γ) = exp(α, γ) ×_o exp(β, γ)


Proof 


a. Use induction up to ω (Proposition 4.13(c)). Let X = {β | β ∈ ω ∧ (∀α)
(α ∈ ω ⇒ α +_o β ∈ ω)}. Clearly, ∅ ∈ X. Assume β ∈ X. Consider any α ∈
ω. Then α +_o β ∈ ω. Hence, α +_o β’ = (α +_o β)’ ∈ ω by Proposition 4.11(a).
Thus, β’ ∈ X.

b. and (c) are left as exercises.

d. Lemma. ⊢ α ∈ ω ∧ β ∈ ω ⇒ α’ +_o β = α +_o β’. Let Y = {β | β ∈ ω ∧ (∀α)
(α ∈ ω ⇒ α’ +_o β = α +_o β’)}. Clearly, ∅ ∈ Y. Assume β ∈ Y. Consider
any α ∈ ω. So, α’ +_o β = α +_o β’. Then α’ +_o β’ = (α’ +_o β)’ = (α +_o β’)’ =
α +_o (β’)’. Hence, β’ ∈ Y.

To prove (d), let X = {β | β ∈ ω ∧ (∀α)(α ∈ ω ⇒ α +_o β = β +_o α)}. Then ∅ ∈ X
and it is easy to prove, using the lemma, that β ∈ X ⇒ β’ ∈ X.
(e)-(g) are left as exercises.


The reader will have noticed that we have not asserted for ordinals
certain well-known laws, such as the commutative laws for addition and
multiplication, that hold for other familiar number systems. In fact, these
laws fail for ordinals, as the following examples show.


Examples 


1. (∃α)(∃β)(α +_o β ≠ β +_o α)


1 +_o ω = ⋃_{α <_o ω}(1 +_o α) = ω
ω +_o 1 = ω’ >_o ω


2. (∃α)(∃β)(α ×_o β ≠ β ×_o α)


2 ×_o ω = ⋃_{α <_o ω}(2 ×_o α) = ω
ω ×_o 2 = ω ×_o (1 +_o 1) = (ω ×_o 1) +_o (ω ×_o 1) = ω +_o ω >_o ω


3. (∃α)(∃β)(∃γ)((α +_o β) ×_o γ ≠ (α ×_o γ) +_o (β ×_o γ))


(1 +_o 1) ×_o ω = 2 ×_o ω = ω
(1 ×_o ω) +_o (1 ×_o ω) = ω +_o ω >_o ω


4. (∃α)(∃β)(∃γ)(exp(α ×_o β, γ) ≠ exp(α, γ) ×_o exp(β, γ))


exp(2 ×_o 2, ω) = exp(4, ω) = ⋃_{α <_o ω} exp(4, α) = ω


exp(2, ω) = ⋃_{α <_o ω} exp(2, α) = ω


So, exp(2, ω) ×_o exp(2, ω) = ω ×_o ω >_o ω.
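The failure of commutativity in Example 1 can be reproduced with a small model of the ordinals below ω ×_o ω. The sketch below rests on an assumption not made in the text: each such ordinal is written as (ω ×_o a) +_o b with a and b natural numbers, and the class `Ord` and function `add` are hypothetical names for this illustration only.

```python
from typing import NamedTuple

class Ord(NamedTuple):
    """The ordinal (omega x limit) + finite, with both fields natural numbers."""
    limit: int
    finite: int

def add(a: Ord, b: Ord) -> Ord:
    """Ordinal sum a +_o b for ordinals below omega x omega."""
    if b.limit > 0:
        # The finite tail of a is absorbed by the omega-part of b.
        return Ord(a.limit + b.limit, b.finite)
    return Ord(a.limit, a.finite + b.finite)

ONE = Ord(0, 1)
OMEGA = Ord(1, 0)
```

With this representation, `add(ONE, OMEGA)` equals `OMEGA`, whereas `add(OMEGA, ONE)` is `Ord(1, 1)`, the successor of ω. Tuple comparison on `Ord` agrees with the ordering <_o on these ordinals.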


Given any wf ℬ of formal number theory S (see Chapter 3), we can
associate with ℬ a wf ℬ* of NBG as follows: first, replace every “+” by “+_o,” every
“·” by “×_o,” and every “f_1^1(t)” by “t ∪ {t}”*; then, if ℬ is 𝒞 ⇒ 𝒟 or ¬𝒞, respectively,
and we have already found 𝒞* and 𝒟*, let ℬ* be 𝒞* ⇒ 𝒟* or ¬𝒞*, respectively;
if ℬ is (∀x)ℰ(x), replace it by (∀x)(x ∈ ω ⇒ ℰ*(x)). This completes the definition
of ℬ*. Now, if x_1, ..., x_n are the free variables (if any) of ℬ, prefix (x_1 ∈ ω ∧ ... ∧
x_n ∈ ω) ⇒ to ℬ*, obtaining a wf ℬ#. This amounts to restricting all variables
to ω and interpreting addition, multiplication and the successor function on
natural numbers as the corresponding operations on ordinals. Then every
axiom ℬ of S is transformed into a theorem ℬ# of NBG. (Axioms (S1)-(S3)
are obviously transformed into theorems, (S4)# is a theorem by Proposition
4.10(c), and (S5)#-(S8)# are properties of ordinal addition and
multiplication.) Now, for any wf ℬ of S, ℬ# is predicative. Hence, by Proposition 4.4, all
instances of (S9)# are provable by Proposition 4.13(c). (In fact, assume ℬ#(∅) ∧
(∀x)(x ∈ ω ⇒ (ℬ#(x) ⇒ ℬ#(x’))). Let X = {y | y ∈ ω ∧ ℬ#(y)}. Then, by Proposition
4.13(c), (∀x)(x ∈ ω ⇒ ℬ#(x)).) Applications of modus ponens are easily seen to
be preserved under the transformation of ℬ into ℬ#. As for the
generalization rule, consider a wf ℬ(x) and assume that ℬ#(x) is provable in NBG. But
ℬ#(x) is of the form x ∈ ω ∧ y_1 ∈ ω ∧ ... ∧ y_n ∈ ω ⇒ ℬ*(x). Hence, y_1 ∈ ω ∧ ... ∧ y_n
∈ ω ⇒ (∀x)(x ∈ ω ⇒ ℬ*(x)) is provable in NBG. But this wf is just ((∀x)ℬ(x))#.
Hence, application of Gen leads from theorems to theorems. Therefore, for
every theorem ℬ of S, ℬ# is a theorem of NBG, and we can translate into
NBG all the theorems of S proved in Chapter 3.

* In abbreviated notation for S, “f_1^1(t)” is written as t’, and in abbreviated notation in NBG,
“t ∪ {t}” is written as t’. So, no change will take place in these abbreviated notations.
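The passage from a wf of S to its starred and #-versions is a purely syntactic recursion, which the following Python sketch imitates on a toy syntax tree. Everything about the encoding is an assumption made for illustration: wfs and terms are nested tuples, the tags ("var", "zero", "succ", "add", "mul", "eq", "not", "implies", "forall", "ord_add", "ord_mul", "in_omega", and so on) are hypothetical, and the treatment of the constant 0 as ∅ is assumed rather than stated in the text.

```python
def star_term(t):
    """Translate a term of S into the corresponding NBG term."""
    tag = t[0]
    if tag == "var":
        return t
    if tag == "zero":
        return ("empty",)  # assumption: the constant 0 is read as the empty set
    if tag == "succ":
        s = star_term(t[1])
        return ("union", s, ("singleton", s))  # t' becomes t u {t}
    if tag in ("add", "mul"):
        return ("ord_" + tag, star_term(t[1]), star_term(t[2]))
    raise ValueError(f"unknown term: {t!r}")

def star(wf):
    """The starred translation: quantifiers are restricted to omega."""
    tag = wf[0]
    if tag == "eq":
        return ("eq", star_term(wf[1]), star_term(wf[2]))
    if tag == "not":
        return ("not", star(wf[1]))
    if tag == "implies":
        return ("implies", star(wf[1]), star(wf[2]))
    if tag == "forall":
        x = wf[1]
        return ("forall", x, ("implies", ("in_omega", x), star(wf[2])))
    raise ValueError(f"unknown wf: {wf!r}")

def free_vars(e):
    """Free variables of a term or wf of S, in order of first occurrence."""
    tag = e[0]
    if tag == "var":
        return [e[1]]
    if tag == "zero":
        return []
    if tag == "forall":
        return [v for v in free_vars(e[2]) if v != e[1]]
    found = []
    for sub in e[1:]:  # succ, add, mul, eq, not, implies
        for v in free_vars(sub):
            if v not in found:
                found.append(v)
    return found

def sharp(wf):
    """The #-translation: guard the starred wf by membership of its free variables in omega."""
    body = star(wf)
    fv = free_vars(wf)
    if not fv:
        return body
    guard = ("in_omega", fv[0])
    for v in fv[1:]:
        guard = ("and", guard, ("in_omega", v))
    return ("implies", guard, body)
```

For a closed wf the guard is empty, so `sharp` and `star` agree, which matches the remark that the prefix is added only for free variables.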

One can check that the number-theoretic function h such that, if x is the
Gödel number of a wf ℬ of S, then h(x) is the Gödel number of ℬ#, and if x is
not the Gödel number of a wf of S, then h(x) = 0, is recursive (in fact,
primitive recursive). Let K be any consistent extension of NBG. As we saw above,
if x is the Gödel number of a theorem of S, then h(x) is the Gödel number of
a theorem of NBG and, hence, also a theorem of K. Let S(K) be the extension
of S obtained by taking as axioms all wfs ℬ of the language of S such that ℬ#
is a theorem of K. Since K is consistent, S(K) must be consistent. Therefore,
since S is essentially recursively undecidable (by Corollary 3.46), S(K) is
recursively undecidable. Now, assume K is recursively decidable; that is, the
set T_K of Gödel numbers of theorems of K is recursive. But C_{T_S(K)}(x) = C_{T_K}(h(x))
for any x, where C_{T_S(K)} and C_{T_K} are the characteristic functions of T_S(K) and T_K.
Hence, T_S(K) would be recursive, contradicting the recursive undecidability
of S(K). Therefore, K is recursively undecidable, and thus, if NBG is
consistent, NBG is essentially recursively undecidable. Recursive undecidability of
a recursively axiomatizable theory implies incompleteness (see Proposition
3.47). Hence, NBG is also essentially incomplete. Thus, we have the following
result: if NBG is consistent, then NBG is essentially recursively undecidable and
essentially incomplete. (It is possible to prove this result directly in the same
way that the corresponding result was proved for S in Chapter 3.)


Exercise 


4.68 Prove that a predicate calculus with a single binary predicate letter is 
recursively undecidable. [Hint: Use Proposition 3.49 and the fact that 
NBG has a finite number of proper axioms.] 


There are a few facts about the “cardinal arithmetic” of ordinal numbers
that we would like to deal with now. By “cardinal arithmetic” we mean
properties connected with the operations of union (∪), Cartesian product
(×) and X^Y, as opposed to the properties of +_o, ×_o, and exp. Observe that × is
distinct from ×_o; also notice that ordinal exponentiation exp(α, β) has nothing
to do with X^Y, the class of all functions from Y into X. From Example 4 on
page 276 we see that exp(2, ω) is ω, whereas, from Cantor’s theorem, ω ≺ 2^ω,
where 2^ω is the set of functions from ω into 2.


Proposition 4.37 


a. ⊢ ω × ω ≅ ω
b. ⊢ 2 ≼ X ∧ 2 ≼ Y ⇒ X ∪ Y ≼ X × Y
c. ⊢ Den(x) ∧ Den(y) ⇒ Den(x ∪ y)


Proof 


a. Let f be a function with domain ω such that, if α ∈ ω, then f’α = ⟨α, ∅⟩.
Then f is a one-one function from ω into a subset of ω × ω. Hence, ω
≼ ω × ω. Conversely, let g be a function with domain ω × ω such that,
for any ⟨α, β⟩ in ω × ω, g’⟨α, β⟩ = exp(2, α) ×_o exp(3, β). We leave it as
an exercise to show that g is a one-one function from ω × ω into ω.
Hence, ω × ω ≼ ω. So, by Bernstein’s theorem, ω × ω ≅ ω.

b. Assume a_1 ∈ X, a_2 ∈ X, a_1 ≠ a_2, b_1 ∈ Y, b_2 ∈ Y, b_1 ≠ b_2. Define


f’x = ⟨x, b_1⟩ if x ∈ X
f’x = ⟨a_1, x⟩ if x ∈ Y − X and x ≠ b_1
f’x = ⟨a_2, b_2⟩ if x = b_1 and x ∈ Y − X


Then f is a one-one function with domain X ∪ Y and range a subset
of X × Y. Hence, X ∪ Y ≼ X × Y.

c. Assume Den(x) and Den(y). Hence, each of x and y contains at least
two elements. Then, by part (b), x ∪ y ≼ x × y. But x ≅ ω and y ≅ ω.
Hence, x × y ≅ ω × ω. Therefore, x ∪ y ≼ ω × ω and, by part (a), ω × ω ≅ ω.
By Proposition 4.30, either Den(x ∪ y) or Fin(x ∪ y). But x ⊆ x ∪ y and
Den(x); hence, ¬Fin(x ∪ y).
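Both maps used in this proof can be tried out on finite data. The sketch below is a hedged illustration: `g` mirrors the map ⟨α, β⟩ ↦ exp(2, α) ×_o exp(3, β) from part (a) on natural numbers, and `union_into_product` (a hypothetical name) builds the function f of part (b) as a Python dictionary from chosen elements a1 ≠ a2 of X and b1 ≠ b2 of Y.

```python
def g(a, b):
    """The map from part (a): (a, b) goes to 2^a * 3^b."""
    return 2 ** a * 3 ** b

def union_into_product(X, Y, a1, a2, b1, b2):
    """The map f from part (b), as a dict from X u Y into X x Y.

    Requires a1 != a2 in X and b1 != b2 in Y.
    """
    def f(x):
        if x in X:
            return (x, b1)
        if x != b1:        # x is in Y - X and differs from b1
            return (a1, x)
        return (a2, b2)    # x = b1 and x is in Y - X
    return {x: f(x) for x in X | Y}
```

Checking that the values of `g` on a finite grid are pairwise distinct, or that the dictionary returned by `union_into_product` has no repeated values, only tests finitely many cases; the general one-one property is the content of the exercise and of part (b).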


For the further study of ordinal addition and multiplication, it is quite useful 
to obtain concrete interpretations of these operations. 


Proposition 4.38 (Addition) 


Assume that ⟨r, x⟩ is similar to ⟨E_α, α⟩, that ⟨s, y⟩ is similar to ⟨E_β, β⟩, and that
x ∩ y = ∅. Let t be the relation on x ∪ y consisting of all ⟨u, v⟩ such that
⟨u, v⟩ ∈ x × y or u ∈ x ∧ v ∈ x ∧ ⟨u, v⟩ ∈ r or u ∈ y ∧ v ∈ y ∧ ⟨u, v⟩ ∈ s (i.e., t is
the same as r in the set x, the same as s in the set y, and every element of x
t-precedes every element of y). Then t is a well-ordering of x ∪ y, and ⟨t, x ∪ y⟩
is similar to ⟨E_{α +_o β}, α +_o β⟩.

Proof 


First, it is simple to verify that t is a well-ordering of x ∪ y, since r is a
well-ordering of x and s is a well-ordering of y. To show that ⟨t, x ∪ y⟩ is similar
to ⟨E_{α +_o β}, α +_o β⟩, use transfinite induction on β. For β = ∅, y = ∅. Hence, t = r,
x ∪ y = x, and α +_o β = α. So, ⟨t, x ∪ y⟩ is similar to ⟨E_{α +_o β}, α +_o β⟩. Assume the
proposition for γ and let β = γ’. Since ⟨s, y⟩ is similar to ⟨E_β, β⟩, we have a
function f with domain y and range β such that, for any u, v in y, ⟨u, v⟩ ∈ s if
and only if f’u ∈ f’v. Let b = (f̆)’γ, let y_1 = y − {b} and let s_1 = s ∩ (y_1 × y_1). Since
b is the s-maximum of y, it follows easily that s_1 well-orders y_1. Also, y_1↿f is
a similarity mapping of y_1 onto γ. Let t_1 = t ∩ ((x ∪ y_1) × (x ∪ y_1)). By
inductive hypothesis, ⟨t_1, x ∪ y_1⟩ is similar to ⟨E_{α +_o γ}, α +_o γ⟩, by means of some
similarity mapping g with domain x ∪ y_1 and range α +_o γ. Extend g to g_1 =
g ∪ {⟨b, α +_o γ⟩}, which is a similarity mapping of x ∪ y onto (α +_o γ)’ = α +_o
γ’ = α +_o β. Finally, if Lim(β) and our proposition holds for all τ <_o β, assume
that f is a similarity mapping of y onto β. Now, for each τ <_o β, let y_τ = (f̆)”τ,
s_τ = s ∩ (y_τ × y_τ), and t_τ = t ∩ ((x ∪ y_τ) × (x ∪ y_τ)). By inductive hypothesis
and Corollary 4.18(b), there is a unique similarity mapping g_τ of ⟨t_τ, x ∪ y_τ⟩
with ⟨E_{α +_o τ}, α +_o τ⟩; also, if τ_1 <_o τ_2 <_o β, then (x ∪ y_{τ_1})↿g_{τ_2} is a
similarity mapping of ⟨t_{τ_1}, x ∪ y_{τ_1}⟩ with ⟨E_{α +_o τ_1}, α +_o τ_1⟩ and, by the uniqueness of
g_{τ_1}, (x ∪ y_{τ_1})↿g_{τ_2} = g_{τ_1}; that is, g_{τ_2} is an extension of g_{τ_1}. Hence, if g = ⋃_{τ <_o β} g_τ
and λ = ⋃_{τ <_o β}(α +_o τ), then g is a similarity mapping of ⟨t, ⋃_{τ <_o β}(x ∪ y_τ)⟩ with
⟨E_λ, λ⟩. But, ⋃_{τ <_o β}(x ∪ y_τ) = x ∪ y and ⋃_{τ <_o β}(α +_o τ) = α +_o β. This completes the
transfinite induction.


Proposition 4.39 (Multiplication) 


Assume that ⟨r, x⟩ is similar to ⟨E_α, α⟩ and that ⟨s, y⟩ is similar to ⟨E_β, β⟩. Let
the relation t on x × y consist of all pairs ⟨⟨u, v⟩, ⟨w, z⟩⟩ such that u and w are
in x and v and z are in y, and either ⟨v, z⟩ ∈ s or (v = z ∧ ⟨u, w⟩ ∈ r). Then t is a
well-ordering of x × y and ⟨t, x × y⟩ is similar to ⟨E_{α ×_o β}, α ×_o β⟩.*


Proof


This is left as an exercise. Proceed as in the proof of Proposition 4.38.


* The ordering t is called an inverse lexicographical ordering because it orders pairs as follows:
first, according to the size of their second components and then, if their second components
are equal, according to the size of their first components.


Examples 


1. 2 ×_o ω = ω. Let ⟨r, x⟩ = ⟨E_2, 2⟩ and ⟨s, y⟩ = ⟨E_ω, ω⟩. Then the Cartesian
product 2 × ω is well-ordered as follows: ⟨∅, ∅⟩, ⟨1, ∅⟩, ⟨∅, 1⟩, ⟨1, 1⟩,
⟨∅, 2⟩, ⟨1, 2⟩, ..., ⟨∅, n⟩, ⟨1, n⟩, ⟨∅, n + 1⟩, ⟨1, n + 1⟩, ...

2. By Proposition 4.34(a), 2 = 1’ = 1 +_o 1. Then by Proposition 4.35(e,a),
ω ×_o 2 = (ω ×_o 1) +_o (ω ×_o 1) = ω +_o ω. Let ⟨r, x⟩ = ⟨E_ω, ω⟩ and ⟨s, y⟩ = ⟨E_2, 2⟩.
Then the Cartesian product ω × 2 is well-ordered as follows: ⟨∅, ∅⟩,
⟨1, ∅⟩, ⟨2, ∅⟩, ..., ⟨∅, 1⟩, ⟨1, 1⟩, ⟨2, 1⟩, ...
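The inverse lexicographical ordering in these examples can be listed explicitly for finite initial pieces of the two products. The following Python sketch is a finite illustration only; the function name `inverse_lex` is hypothetical, and Python integers stand in for the finite ordinals.

```python
def inverse_lex(x, y):
    """List the pairs of x × y ordered first by second component,
    then by first component."""
    return sorted(((u, v) for u in x for v in y), key=lambda p: (p[1], p[0]))
```

Calling `inverse_lex(range(2), range(3))` reproduces the start of the listing in Example 1, and `inverse_lex(range(3), range(2))` shows the two consecutive blocks of Example 2. For finite factors both products have the same number of pairs; the difference between 2 ×_o ω and ω ×_o 2 appears only because, in the second ordering, the pair ⟨∅, 1⟩ has infinitely many predecessors.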


Proposition 4.40 
For all α, ω_α × ω_α ≅ ω_α.


Proof 


(Sierpinski, 1958) Assume this is false and let α be the least ordinal such that
ω_α × ω_α ≅ ω_α is false. Then ω_β × ω_β ≅ ω_β for all β <_o α. By Proposition 4.37(a),
α >_o ∅. Now let P = ω_α × ω_α and, for β <_o ω_α, let P_β = {⟨γ, δ⟩ | γ +_o δ = β}. First
we wish to show that P = ⋃_{β <_o ω_α} P_β. Now, if γ +_o δ = β <_o ω_α, then γ ≤_o β <_o ω_α
and δ ≤_o β <_o ω_α; hence, ⟨γ, δ⟩ ∈ ω_α × ω_α = P. Thus, ⋃_{β <_o ω_α} P_β ⊆ P. To show
that P ⊆ ⋃_{β <_o ω_α} P_β, it suffices to show that, if γ <_o ω_α and δ <_o ω_α, then
γ +_o δ <_o ω_α. This is clear when γ or δ is finite. Hence, we may assume that γ and
δ are equinumerous with initial ordinals ω_σ ≤_o γ and ω_ρ ≤_o δ, respectively.
Let ζ be the larger of σ and ρ. Since γ <_o ω_α and δ <_o ω_α, then ω_ζ <_o ω_α. Hence,
by the minimality of α, ω_ζ × ω_ζ ≅ ω_ζ. Let x = γ × {∅} and y = δ × {1}. Then, by
Proposition 4.38, x ∪ y ≅ γ +_o δ. Since γ ≅ ω_σ and δ ≅ ω_ρ, x ≅ ω_σ × {∅} and
y ≅ ω_ρ × {1}. Hence, since x ∩ y = ∅, x ∪ y ≅ (ω_σ × {∅}) ∪ (ω_ρ × {1}). But, by Proposition
4.37(b), (ω_σ × {∅}) ∪ (ω_ρ × {1}) ≼ (ω_σ × {∅}) × (ω_ρ × {1}) ≅ ω_σ × ω_ρ ≼ ω_ζ × ω_ζ ≅ ω_ζ.
Hence, γ +_o δ ≼ ω_ζ <_o ω_α. It follows that γ +_o δ <_o ω_α. (If ω_α ≤_o γ +_o δ, then ω_α ≼
ω_ζ. Since ω_ζ <_o ω_α, ω_ζ ≼ ω_α. So, by Bernstein's theorem, ω_α ≅ ω_ζ, contradicting
the fact that ω_α is an initial ordinal.) Thus, P = ⋃_{β <_o ω_α} P_β. Consider P_β for any
β <_o ω_α. By Proposition 4.34(f), for each γ ≤_o β, there is exactly one ordinal
δ such that γ +_o δ = β. Hence, there is a similarity mapping from β′ onto P_β,
where P_β is ordered according to the size of the first component γ of the pairs
⟨γ, δ⟩. Define the following relation R on P. For any γ <_o ω_α, δ <_o ω_α, μ <_o ω_α, ν
<_o ω_α, ⟨⟨γ, δ⟩, ⟨μ, ν⟩⟩ ∈ R if and only if either γ +_o δ <_o μ +_o ν or (γ +_o δ = μ +_o ν ∧
γ <_o μ). Thus, if β_1 <_o β_2 <_o ω_α, then the pairs in P_{β_1} R-precede the pairs in P_{β_2},
and, within each P_β, the pairs are R-ordered according to the size of their
first components. One easily verifies that R well-orders P. Since P = ω_α × ω_α,
it suffices now to show that (R, P) is similar to (E_{ω_α}, ω_α). By Proposition 4.19,
(R, P) is similar to some (E_ξ, ξ), where ξ is an ordinal. Hence, P ≅ ξ. Assume
that ξ >_o ω_α. There is a similarity mapping f between (E_ξ, ξ) and (R, P). Let
b = f′ω_α; then b is an ordered pair ⟨γ, δ⟩ with γ <_o ω_α, δ <_o ω_α, and the restriction
of f to ω_α is a similarity mapping between (E_{ω_α}, ω_α) and the R-segment
Y = Seg_R(P, ⟨γ, δ⟩) of P determined by ⟨γ, δ⟩. Then Y ≅ ω_α. If we let β = γ +_o δ, then,
if ⟨σ, ρ⟩ ∈ Y, we have σ +_o ρ ≤_o γ +_o δ = β; hence, σ ≤_o β and ρ ≤_o β. Therefore,
Y ⊆ β′ × β′. But β′ <_o ω_α. Since β is obviously not finite, β′ ≅ ω_μ with μ <_o α. By the
minimality of α, ω_μ × ω_μ ≅ ω_μ. So, ω_α ≅ Y ≼ ω_μ, contradicting ω_μ ≺ ω_α. Thus, ξ ≤_o ω_α
and, therefore, P ≼ ω_α. Let h be the function with domain ω_α such that h′β = ⟨β, ∅⟩
for every β <_o ω_α. Then h is a one-one correspondence between ω_α and the
subset ω_α × {∅} of P and, therefore, ω_α ≼ P. By Bernstein's theorem, ω_α ≅ P,
contradicting the definition of α. Hence, ω_β × ω_β ≅ ω_β for all β.
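
The relation R of the proof can be tried out in the simplest case, where all the ordinals involved are finite and +_o is ordinary addition. The sketch below is an illustration only: the names r_key, r_enumerate, and r_position are hypothetical, and it covers only pairs of natural numbers, not the general ω_α. It orders pairs first by their sum β (the block P_β) and then by first component, and checks that each pair has only finitely many R-predecessors, so that the enumeration has order type ω.

```python
def r_key(pair):
    """Sort key for R: compare the sums first, then the first components."""
    gamma, delta = pair
    return (gamma + delta, gamma)

def r_enumerate(limit):
    """All pairs (γ, δ) of naturals with γ + δ < limit, listed in R-order."""
    pairs = [(g, b - g) for b in range(limit) for g in range(b + 1)]
    return sorted(pairs, key=r_key)

def r_position(pair):
    """Number of R-predecessors of (γ, δ).

    The blocks P_0, ..., P_{β-1} contain 1 + 2 + ... + β pairs in total,
    and within P_β the pair sits at index γ.
    """
    gamma, delta = pair
    beta = gamma + delta
    return beta * (beta + 1) // 2 + gamma

print(r_enumerate(3))
# [(0, 0), (0, 1), (1, 0), (0, 2), (1, 1), (2, 0)]
```

Because r_position is a one-one correspondence between pairs of naturals and naturals, this finite case mirrors the conclusion ω × ω ≅ ω.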


Corollary 4.41 


If x ≅ ω_α and y ≅ ω_β, and if γ is the maximum of α and β, then x × y ≅ ω_γ and
x ∪ y ≅ ω_γ. In particular, ω_α × ω_β ≅ ω_γ.


Proof 


By Propositions 4.40 and 4.37(b), ω_γ ≼ x ∪ y ≼ x × y ≅ ω_α × ω_β ≼ ω_γ × ω_γ ≅ ω_γ.
Hence, by Bernstein's theorem, x × y ≅ ω_γ and x ∪ y ≅ ω_γ.


Exercises 


4.69 Prove that the following are theorems of NBG.
a. x ≼ ω_α ⇒ x ∪ ω_α ≅ ω_α
b. ω_α +_c ω_α ≅ ω_α
c. ∅ ≠ x ≼ ω_α ⇒ x × ω_α ≅ ω_α
d. ∅ ≠ x ≺ ω ⇒ (ω_α)^x ≅ ω_α
4.70 Prove that the following are theorems of NBG.
a. 𝒫(ω_α) × 𝒫(ω_α) ≅ 𝒫(ω_α)
b. x ≼ 𝒫(ω_α) ⇒ x ∪ 𝒫(ω_α) ≅ 𝒫(ω_α)
c. ∅ ≠ x ≼ 𝒫(ω_α) ⇒ x × 𝒫(ω_α) ≅ 𝒫(ω_α)
d. ∅ ≠ x ≼ ω_α ⇒ (𝒫(ω_α))^x ≅ 𝒫(ω_α)
e. 1 ≺ x ≼ ω_α ⇒ x^{ω_α} ≅ (ω_α)^{ω_α} ≅ (𝒫(ω_α))^{ω_α} ≅ 𝒫(ω_α).
4.71 Assume y ≠ ∅ ∧ y ≅ y +_c y. (This assumption holds for y = ω_α by Corollary
4.41 and for y = 𝒫(ω_α) by Exercise 4.70(b). It will turn out to hold for all
infinite sets y if the axiom of choice holds.) Prove the following properties of y.
a. Inf(y)
b. y ≅ 1 +_c y
c. (∃u)(∃v)(y = u ∪ v ∧ u ∩ v = ∅ ∧ u ≅ y ∧ v ≅ y)
d. {z | z ⊆ y ∧ z ≅ y} ≅ 𝒫(y)
e. {z | z ⊆ y ∧ Inf(z)} ≅ 𝒫(y)
f. (∃f)(y ≅_f y ∧ (∀u)(u ∈ y ⇒ f′u ≠ u))

4.72 Assume y ≅ y × y ∧ 1 ≺ y. (This holds when y = ω_α by Proposition 4.40
and for y = 𝒫(ω_α) by Exercise 4.70(a). It is true for all infinite sets y if the
axiom of choice holds.) Prove the following properties of y.

a. y ≅ y +_c y
b.^D Let Perm(y) denote {f | y ≅_f y}. Then Perm(y) ≅ 𝒫(y).


4.5 The Axiom of Choice: The Axiom of Regularity 


The axiom of choice is one of the most celebrated and contested statements 
of the theory of sets. We shall state it in the next proposition and show its 
equivalence to several other important assertions. 


Proposition 4.42 


The following wfs are equivalent. 


a. Axiom of choice (AC). For any set x, there is a function f such that, for
any nonempty subset y of x, f′y ∈ y. (f is called a choice function for x.)

b. Multiplicative axiom (Mult). If x is a set of pairwise disjoint nonempty
sets, then there is a set y (called a choice set for x) such that y contains
exactly one element of each set in x:

(∀u)(u ∈ x ⇒ u ≠ ∅ ∧ (∀v)(v ∈ x ∧ v ≠ u ⇒ v ∩ u = ∅)) ⇒
(∃y)(∀u)(u ∈ x ⇒ (∃_1 w)(w ∈ u ∩ y))

c. Well-ordering principle (WO). Every set can be well-ordered: (∀x)(∃y)
(y We x).

d. Trichotomy (Trich). (∀x)(∀y)(x ≼ y ∨ y ≼ x)*

e. Zorn's Lemma (Zorn). Any nonempty partially ordered set x, in which
every chain (i.e., every totally ordered subset) has an upper bound,
has a maximal element:

(∀x)(∀y)([(y Part x) ∧ (∀u)(u ⊆ x ∧ y Tot u ⇒
(∃v)(v ∈ x ∧ (∀w)(w ∈ u ⇒ w = v ∨ ⟨w, v⟩ ∈ y)))] ⇒
(∃v)(v ∈ x ∧ (∀w)(w ∈ x ⇒ ⟨v, w⟩ ∉ y)))


* This is equivalent to (∀x)(∀y)(x ≺ y ∨ x ≅ y ∨ y ≺ x), which explains the name “trichotomy” for
this principle.
Proof


1. ⊢ WO ⇒ Trich. Given sets x and y, then, by WO, x and y can be well-
ordered. Hence, by Proposition 4.19, x ≅ α and y ≅ β for some ordinals
α and β. But, by Exercise 4.52, α ≼ β or β ≼ α. Therefore, x ≼ y or y ≼ x.

2. ⊢ Trich ⇒ WO. Given a set x, Hartogs' theorem yields an ordinal α
such that α is not equinumerous with any subset of x, that is, α ≼ x is
false. So, by Trich, x ≼ α, that is, x is equinumerous with some subset
y of α. Hence, by translating the well-ordering E_y of y to x, x can be
well-ordered.

3. ⊢ WO ⇒ Mult. Let x be a set of nonempty pairwise disjoint sets. By
WO, there is a well-ordering R of ⋃x. Hence, there is a function f
with domain x such that, for any u in x, f′u is the R-least element of u.
(Notice that u is a subset of ⋃x.) The range of f is then a choice set for x.

4. ⊢ Mult ⇒ AC. For any set x, we can define a one-one function g such
that, for each nonempty subset u of x, g′u = u × {u}. Let x_1 be the range
of g. Then x_1 is a set of nonempty pairwise disjoint sets. Hence, by
Mult, there is a choice set y for x_1. Therefore, if u is a nonempty subset
of x, then u × {u} is in x_1, and so y contains exactly one element ⟨v, u⟩ in
u × {u}. Then the function f such that f′u = v is a choice function for x.

5. ⊢ AC ⇒ Zorn. Let y partially order a nonempty set x such that
every y-chain in x has an upper bound in x. By AC, there is a choice
function f for x. Let b be any element of x. By transfinite induction
(Proposition 4.14(a)), there is a function F such that F′∅ = b and, for
any α >_o ∅, F′α is f′u, where u is the set of y-upper bounds v in x of
F″α such that v ∉ F″α. Let β be the least ordinal such that the set of
y-upper bounds in x of F″β that are not in F″β is empty. (There must
be such an ordinal. Otherwise, F would be a one-one function with
domain On and range a subset of x, which, by the replacement axiom
R, would imply that On is a set.) Let g be the restriction of F to β. Then
it is easy to check that g is one-one and, if α <_o γ <_o β, ⟨g′α, g′γ⟩ ∈ y.
Hence, g″β is a y-chain in x; by hypothesis, there is a y-upper bound w
of g″β. Since the set of y-upper bounds of F″β (= g″β) that are not in
g″β is empty, w ∈ g″β and w is the only y-upper bound of g″β (because
a set can contain at most one of its y-upper bounds). Hence, w is a
y-maximal element. (If ⟨w, z⟩ ∈ y and z ∈ x, then z is a y-upper bound
of g″β, which is impossible.)

6. ⊢ Zorn ⇒ WO. Given a set z, let X be the class of all one-one func-
tions with domain an ordinal and range a subset of z. By Hartogs'
theorem, X is a set. Clearly, ∅ ∈ X. X is partially ordered by the
proper inclusion relation ⊂. Given any chain of functions in X, of
any two, one is an extension of the other. Hence, the union of all the
functions in the chain is also a one-one function from an ordinal
into z, which is a ⊂-upper bound of the chain. Hence, by Zorn, X has
a maximal element g, which is a one-one function from an ordinal
α into z. Assume z − g″α ≠ ∅ and let b ∈ z − g″α. Let f = g ∪ {⟨α, b⟩}.
Then f ∈ X and g ⊂ f, contradicting the maximality of g. So, g″α = z.
Thus, α ≅ z. By means of g, we can transfer the well-ordering E_α of α
to a well-ordering of z.
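
The tagging construction in step 4 (Mult ⇒ AC) can be made concrete on a small finite set. This is a sketch under loud assumptions: all function names are hypothetical, the elements of x are assumed to be mutually comparable (so that min can stand in for the choice set that Mult provides), and in the finite case no choice principle is actually needed.

```python
from itertools import combinations

def nonempty_subsets(x):
    """All nonempty subsets of the finite set x, as frozensets."""
    items = list(x)
    return [frozenset(c) for r in range(1, len(items) + 1)
            for c in combinations(items, r)]

def tagged_family(x):
    """g'u = u × {u}: tagging each element with u makes the sets pairwise disjoint."""
    return {u: frozenset((v, u) for v in u) for u in nonempty_subsets(x)}

def choice_set(family):
    """Finite stand-in for Mult: take one pair (v, u) from each member of the family."""
    return {min(member, key=lambda pair: pair[0]) for member in family}

def choice_function(x):
    """Read a choice function f off the choice set: f'u = v where (v, u) was chosen."""
    y = choice_set(tagged_family(x).values())
    return {u: v for (v, u) in y}

f = choice_function({1, 2, 3})
print(f[frozenset({2, 3})])  # an element of {2, 3}
```

The point of the tags is visible in tagged_family: the subsets {1, 2} and {2, 3} overlap, but {1, 2} × {{1, 2}} and {2, 3} × {{2, 3}} do not, so Mult applies to the tagged family.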


Exercises 


4.73 Show that each of the following is equivalent to the axiom of choice.

a. Any set x is equinumerous with some ordinal.

b. Special case of Zorn's lemma. If x is a nonempty set and if the union
of each nonempty ⊂-chain in x is also in x, then x has a ⊂-maximal
element.

c. Hausdorff maximal principle. If x is a set, then every ⊂-chain in x is a
subset of some maximal ⊂-chain in x.

d. Teichmüller-Tukey Lemma. Any set of finite character has a ⊂-maxi-
mal element. (A nonempty set x is said to be of finite character if and
only if: (i) every finite subset of an element of x is also an element of x;
and (ii) if every finite subset of a set y is a member of x, then y ∈ x.)

e. (∀x)(Rel(x) ⇒ (∃y)(Fnc(y) ∧ 𝒟(x) = 𝒟(y) ∧ y ⊆ x))

f. For any nonempty sets x and y, either there is a function with
domain x and range y or there is a function with domain y and
range x.

4.74 Show that the following finite axiom of choice is provable in NBG: if x
is a finite set of nonempty disjoint sets, then there is a choice set y for x.
[Hint: Assume x ≅ α where α ∈ ω. Use induction on α.]


Proposition 4.43 


The following are consequences of the axiom of choice. 


a. Any infinite set has a denumerable subset. 

b. An infinite set is Dedekind-infinite. 

c. If x is a denumerable set whose elements are denumerable sets,
then ⋃x is denumerable.


Proof 


Assume AC. 


a. Let x be an infinite set. By Exercise 4.73(a), x is equinumerous with
some ordinal α. Since x is infinite, so is α. Hence, ω ≤_o α; therefore,
ω is equinumerous with some subset of x.

b. The proof is by part (a) and Exercise 4.64(c).

c. Assume x is a denumerable set of denumerable sets. Let f be a func-
tion assigning to each u in x the set of all one-one correspondences
between u and ω. Let z be the union of the range of f. Then, by AC
applied to z, there is a function g such that g′v ∈ v for each non-
empty v ⊆ z. In particular, if u ∈ x, then g′(f′u) is a one-one cor-
respondence between u and ω. Let h be a one-one correspondence
between ω and x. Define a function F on ⋃x as follows: let y ∈ ⋃x
and let n be the smallest element of ω such that y ∈ h′n. Now, h′n ∈ x;
so, g′(f′(h′n)) is a one-one correspondence between h′n and ω. Define
F′y = ⟨n, (g′(f′(h′n)))′y⟩. Then F is a one-one function with domain ⋃x
and range a subset of ω × ω. Hence, ⋃x ≼ ω × ω. But ω × ω ≅ ω and,
therefore, ⋃x ≼ ω. If v ∈ x, then v ⊆ ⋃x and v ≅ ω. Hence, ω ≼ ⋃x.
By Bernstein's theorem, ⋃x ≅ ω.
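
The map F of part (c) can be imitated on a finite family. In this sketch (the name union_injection and the sample data are assumptions for illustration), the list position n plays the role of h′n, and the order of the elements inside each list plays the role of the correspondence g′(f′(h′n)) that AC selects; for finite lists that order is simply given, so no choice is involved.

```python
def union_injection(family):
    """Map each y in the union to (n, position of y in family[n]),
    where n is the least index with y in family[n]."""
    F = {}
    for n, members in enumerate(family):
        for position, y in enumerate(members):
            # setdefault keeps the value from the smallest n containing y
            F.setdefault(y, (n, position))
    return F

family = [["a", "b"], ["b", "c"], ["a", "d", "c"]]
F = union_injection(family)
print(F)
# {'a': (0, 0), 'b': (0, 1), 'c': (1, 1), 'd': (2, 1)}
```

Distinct elements receive distinct pairs, which is the one-one property used to conclude that the union is embedded in ω × ω.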


Exercises 


4.75 If x is a set, the Cartesian product Π_{u∈x} u is the set of functions f with
domain x such that f′u ∈ u for all u ∈ x. Show that AC is equivalent to
the proposition that the Cartesian product of any set x of nonempty sets
is also nonempty.

4.76 Show that AC implies that any partial ordering of a set x is included in
a total ordering of x.

4.77 Prove that the following is a consequence of AC: for any ordinal α, if x is
a set such that x ≼ ω_α and such that (∀u)(u ∈ x ⇒ u ≼ ω_α), then ⋃x ≼ ω_α.
[Hint: The proof is like that of Proposition 4.43(c).]

4.78 a. Prove y ≼ x ⇒ (∃f)(Fnc(f) ∧ 𝒟(f) = x ∧ ℛ(f) = y).
b. Prove that AC implies the converse of part (a).

4.79 a. Prove (u +_c v)^2 ≅ u^2 +_c (2 × (u × v)) +_c v^2.
b. Assume y is a well-ordered set such that x × y ≅ x +_c y and ¬(y ≼ x).
Prove that x ≼ y.
c. Assume y ≅ y × y for all infinite sets y. Prove that, if Inf(x) and z = ℋ′x,
then x × z ≅ x +_c z.
d. Prove that AC is equivalent to (∀y)(Inf(y) ⇒ y ≅ y × y) (Tarski, 1923).


A stronger form of the axiom of choice is the following sentence: (∃X)(Fnc(X)
∧ (∀u)(u ≠ ∅ ⇒ X′u ∈ u)). (There is a universal choice function (UCF), i.e., a
function that assigns to every nonempty set u an element of u.) UCF obvi-
ously implies AC, but W.B. Easton proved in 1964 that UCF is not provable
from AC if NBG is consistent. However, Felgner (1971b) proved that, for any
sentence ℬ in which all quantifiers are restricted to sets, if ℬ is provable from
NBG + (UCF), then ℬ is provable in NBG + (AC). (See Felgner (1976) for a
thorough treatment of the relations between UCF and AC.)

The theory of cardinal numbers can be simplified if we assume AC; for AC
implies that every set is equinumerous with some ordinal and, therefore,
that every set x is equinumerous with a unique initial ordinal, which can
be designated as the cardinal number of x. Thus, the cardinal numbers would
be identified with the initial ordinals. To conform with the standard nota-
tion for ordinals, we let ℵ_α stand for ω_α. Proposition 4.40 and Corollary 4.41
establish some of the basic properties of addition and multiplication of
cardinal numbers.

The status of the axiom of choice has become less controversial in recent 
years. To most mathematicians it seems quite plausible, and it has so many 
important applications in practically all branches of mathematics that not to 
accept it would seem to be a willful hobbling of the practicing mathemati- 
cian. We shall discuss its consistency and independence later in this section. 

Another hypothesis that has been proposed as a basic principle of set the- 
ory is the so-called regularity axiom (Reg): 


(∀X)(X ≠ ∅ ⇒ (∃y)(y ∈ X ∧ y ∩ X = ∅))


(Every nonempty class X contains a member that is disjoint from X.) 


Proposition 4.44 


a. The regularity axiom implies the Fundierungsaxiom:

¬(∃f)(Fnc(f) ∧ 𝒟(f) = ω ∧ (∀u)(u ∈ ω ⇒ f′(u′) ∈ f′u))

that is, there is no infinitely descending ∈-sequence x_0 ∋ x_1 ∋ x_2 ∋ ...
b. If we assume AC, then the Fundierungsaxiom implies the regularity
axiom.
c. The regularity axiom implies the nonexistence of finite ∈-cycles,
that is, of functions f on a nonzero finite ordinal α′ such that f′∅ ∈ f′1
∈ ... ∈ f′α ∈ f′∅. In particular, it implies that there is no set y such that
y ∈ y.


Proof


a. Assume Fnc(f) ∧ 𝒟(f) = ω ∧ (∀u)(u ∈ ω ⇒ f′(u′) ∈ f′u). Let z = f″ω. By
(Reg), there is some element y in z such that y ∩ z = ∅. Since y ∈ z,
there is a finite ordinal α such that y = f′α. Then f′(α′) ∈ y ∩ z, contra-
dicting y ∩ z = ∅.

b. First, we define the transitive closure TC(u) of a set u. Intuitively, we
want TC(u) to be the smallest transitive set that contains u. Define by
induction a function g on ω such that g′∅ = {u} and g′(α′) = ⋃(g′α)
for each α in ω. Thus, g′1 = u, g′2 = ⋃u, g′3 = ⋃(⋃u), and so on. Let
TC(u) = ⋃(g″ω) be called the transitive closure of u. For any u, TC(u)
is transitive; that is, (∀v)(v ∈ TC(u) ⇒ v ⊆ TC(u)). Now, assume AC and
the Fundierungsaxiom; also, assume X ≠ ∅ but there is no y in X
such that y ∩ X = ∅. Let b be some element of X; hence, b ∩ X ≠ ∅. Let
c = TC(b) ∩ X. By AC, let h be a choice function for c. Define a func-
tion f on ω such that f′∅ = b and, for any α in ω, f′(α′) = h′((f′α) ∩ X).
It follows easily that, for each α in ω, f′(α′) ∈ f′α, contradicting the
Fundierungsaxiom. (The proof can be summarized as follows: we
start with an element b of X; then, using h, we pick an element f′1 in
b ∩ X; since, by assumption, f′1 and X cannot be disjoint, we pick an
element f′2 in f′1 ∩ X, and so on.)

c. Assume given a finite ∈-cycle: f′∅ ∈ f′1 ∈ ... ∈ f′n ∈ f′∅. Let X be the
range of f: {f′∅, f′1, ..., f′n}. By (Reg), there is some f′j in X such that f′j
∩ X = ∅. But each element of X has an element in common with X.*
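
The inductive definition of TC(u) in part (b) can be run directly on hereditarily finite sets. The sketch below models such sets as nested Python frozensets; the function names are hypothetical, and the loop terminates only because every membership chain in such a set is finite.

```python
def big_union(s):
    """⋃s: the set of all members of members of s."""
    return frozenset(member for element in s for member in element)

def transitive_closure(u):
    """TC(u) as the union of the layers g'0 = {u}, g'(n+1) = ⋃(g'n)."""
    layer = frozenset([u])
    closure = set()
    while layer:            # a layer is empty once all chains are exhausted
        closure |= layer
        layer = big_union(layer)
    return frozenset(closure)

def is_transitive(s):
    """Trans(s): every member of s is a subset of s."""
    return all(v <= s for v in s)

empty = frozenset()
one = frozenset([empty])              # {∅}
u = frozenset([frozenset([one])])     # {{{∅}}}
tc = transitive_closure(u)
print(len(tc), is_transitive(tc), is_transitive(u))
# 4 True False
```

Here TC(u) consists of u, {{∅}}, {∅}, and ∅; note that u itself is a member of TC(u), matching the convention g′∅ = {u}.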




Exercises 


4.80 If z is a transitive set such that u ∈ z, prove that TC(u) ⊆ z.


4.81 By the principle of dependent choices (PDC) we mean the following: if
r is a nonempty relation whose range is a subset of its domain, then
there is a function f: ω → 𝒟(r) such that (∀u)(u ∈ ω ⇒ ⟨f′u, f′(u′)⟩ ∈ r)
(Mostowski, 1948).


a. Prove ⊢ AC ⇒ PDC.
b. Show that PDC implies the denumerable axiom of choice (DAC):


Den(x) ∧ (∀u)(u ∈ x ⇒ u ≠ ∅) ⇒ (∃f)(f: x → ⋃x ∧ (∀u)(u ∈ x ⇒ f′u ∈ u))


c. Prove ⊢ PDC ⇒ (∀x)(Inf(x) ⇒ ω ≼ x). (Hence, by Exercise 4.64(c), PDC
implies that a set is infinite if and only if it is Dedekind-infinite.) 


d. Prove that the conjunction of PDC and the Fundierungsaxiom 
implies (Reg). 


Let us define by transfinite induction the following function Ψ with domain On:

Ψ′∅ = ∅
Ψ′(α′) = 𝒫(Ψ′α)
Lim(λ) ⇒ Ψ′λ = ⋃_{β <_o λ} Ψ′β


* The use of AC in deriving (Reg) from the Fundierungsaxiom is necessary. Mendelson (1958) 
proved that, if NBG is consistent and if we add the Fundierungsaxiom as an axiom, then 
(Reg) is not provable in this enlarged theory. 

Let H stand for ⋃(Ψ″On), that is, H consists of all members of sets of the form
Ψ′α. Let H_β stand for Ψ′(β′). Thus, H_β = 𝒫(Ψ′β) and H_{β′} = 𝒫(Ψ′(β′)) = 𝒫(H_β). In
particular, H_∅ = 𝒫(Ψ′∅) = 𝒫(∅) = {∅}, H_1 = 𝒫(H_∅) = 𝒫({∅}) = {∅, {∅}}, and H_2 =
𝒫(H_1) = 𝒫({∅, {∅}}) = {∅, {∅}, {{∅}}, {∅, {∅}}}.

Define a function ρ on H such that, for any x in H, ρ′x is the least ordinal α
such that x ∈ Ψ′α. ρ′x is called the rank of x. Observe that ρ′x must be a suc-
cessor ordinal. (In fact, there are no sets of rank ∅, since Ψ′∅ = ∅. If λ is a limit
ordinal, every set in Ψ′λ already was a member of Ψ′β for some β <_o λ.) As
examples, note that ρ′∅ = 1, ρ′{∅} = 2, ρ′{∅, {∅}} = 3, and ρ′{{∅}} = 3.
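
The first few levels of the hierarchy and the ranks of small sets can be computed mechanically. The following sketch handles finite levels only and uses hypothetical names (powerset, psi, rank); the search bound is kept at 4 because Ψ′5 already has 65536 members. The rank is found by direct search, following the definition of ρ′x as the least α with x ∈ Ψ′α.

```python
from itertools import combinations

def powerset(s):
    """𝒫(s) for a finite set s, as a frozenset of frozensets."""
    items = list(s)
    return frozenset(frozenset(c) for r in range(len(items) + 1)
                     for c in combinations(items, r))

def psi(n):
    """Ψ'n for finite n: Ψ'0 = ∅ and Ψ'(n+1) = 𝒫(Ψ'n)."""
    level = frozenset()
    for _ in range(n):
        level = powerset(level)
    return level

def rank(x, bound=4):
    """Least n ≤ bound such that x ∈ Ψ'n."""
    for n in range(bound + 1):
        if x in psi(n):
            return n
    raise ValueError("rank exceeds the search bound")

empty = frozenset()
one = frozenset([empty])                 # {∅}
two = frozenset([empty, one])            # {∅, {∅}}
print([len(psi(n)) for n in range(5)])   # [0, 1, 2, 4, 16]
print(rank(empty), rank(one), rank(two), rank(frozenset([one])))
# 1 2 3 3
```

Since H_n is Ψ′(n′), the sets H_∅, H_1, and H_2 are psi(1), psi(2), and psi(3), with 1, 2, and 4 members, respectively.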


Exercise 


4.82 Prove that the following are theorems of NBG.
a. (∀α) Trans(Ψ′α)
b. Trans(H)
c. (∀α)(Ψ′α ⊆ Ψ′(α′))
d. (∀α)(∀β)(α <_o β ⇒ Ψ′α ⊆ Ψ′β)
e. On ⊆ H
f. (∀α)(ρ′α = α′)
g. (∀u)(∀v)(u ∈ H ∧ v ∈ H ∧ u ∈ v ⇒ ρ′u <_o ρ′v)
h. (∀u)(u ⊆ H ⇒ u ∈ H)


Proposition 4.45 


The regularity axiom is equivalent to the assertion that V = H, that is, that 
every set is a member of H. 


Proof 


a. Assume V = H. Let X ≠ ∅. Let α be the least of the ranks of all the
members of X, and let b be an element of X such that ρ′b = α. Then b ∩
X = ∅; for, if u ∈ b ∩ X, then, by Exercise 4.82(g), ρ′u <_o ρ′b = α, contra-
dicting the minimality of α.

b. Assume (Reg). Assume V ≠ H. Then V − H ≠ ∅. By (Reg), there is
some y in V − H such that y ∩ (V − H) = ∅. Hence, y ⊆ H and so, by
Exercise 4.82(h), y ∈ H, contradicting y ∈ V − H.


Exercises 


4.83 Show that (Reg) is equivalent to the special case: (∀x)(x ≠ ∅ ⇒ (∃y)(y ∈
x ∧ y ∩ x = ∅)).

4.84 Show that, if we assume (Reg), then Ord(X) is equivalent to Trans(X) ∧
E Con X, that is, to the wf


(∀u)(u ∈ X ⇒ u ⊆ X) ∧ (∀u)(∀v)(u ∈ X ∧ v ∈ X ∧ u ≠ v ⇒ u ∈ v ∨ v ∈ u)


Thus, with the regularity axiom, a much simpler definition of the notion of
ordinal class is available, a definition in which all quantifiers are restricted
to sets.


4.85 Show that (Reg) implies that every nonempty transitive class contains ∅.


Proposition 4.45 certainly increases the attractiveness of adding (Reg) as 
a new axiom to NBG. The proposition V = H asserts that every set can be 
obtained by starting with ∅ and applying the power set and union oper-
ations any transfinite number of times. The assumption that this is so is
called the iterative conception of set. Many set theorists now regard this con- 
ception as the best available formalization of our intuitive picture of the 
universe of sets.* 

By Exercise 4.84, the regularity axiom would also simplify the definition 
of ordinal numbers. In addition, we can develop the theory of cardinal num- 
bers on the basis of the regularity axiom; namely, just define the cardinal 
number of a set x to be the set of all those y of lowest rank such that y ≅ x.
This would satisfy the basic requirement of a theory of cardinal numbers,
the existence of a function Card whose domain is V and such that (∀x)(∀y)
(Card′x = Card′y ⇔ x ≅ y).

There is no unanimity among mathematicians about whether we have suffi- 
cient grounds for adding (Reg) as a new axiom, for, although it has great sim- 
plifying power, it does not have the immediate plausibility that even the axiom 
of choice has, nor has it had any mathematical applications. Nevertheless, it is 
now often taken without explicit mention to be one of the axioms. 

The class H determines an inner model of NBG in the following sense. For any
wf ℬ (written in unabbreviated notation), let Rel_H(ℬ) be the wf obtained from
ℬ by replacing every subformula (∀X)𝒞(X) by (∀X)(X ⊆ H ⇒ 𝒞(X)) (in mak-
ing the replacements we start with the innermost subformulas) and then, if ℬ
contains free variables Y_1, ..., Y_n, prefixing (Y_1 ⊆ H ∧ Y_2 ⊆ H ∧ ... ∧ Y_n ⊆ H) ⇒.

In other words, in forming Rel_H(ℬ), we interpret “class” as “subclass of H.”
Since M(X) stands for (∃Y)(X ∈ Y), Rel_H(M(X)) is (∃Y)(Y ⊆ H ∧ X ∈ Y), which
is equivalent to X ∈ H; thus, the “sets” of the model are the elements of H.
Hence, Rel_H((∀x)ℬ) is equivalent to (∀x)(x ∈ H ⇒ ℬ^#), where ℬ^# is Rel_H(ℬ).
Note also that ⊢ X ⊆ H ∧ Y ⊆ H ⇒ [Rel_H(X = Y) ⇔ X = Y]. Then it turns out
that, for any theorem ℬ of NBG, Rel_H(ℬ) is also a theorem of NBG.


* The iterative conception seems to presuppose that we understand the power set and union 
operations and that ordinal numbers (or something essentially equivalent to them) are avail- 
able for carrying out the transfinite iteration of the power set and union operations. 

Exercises


4.86 Verify that, for each axiom ℬ of NBG, ⊢_NBG Rel_H(ℬ). If we adopt a
semantic approach, one need only show that, if ℳ is a model for NBG,
in the usual sense of “model,” then the objects X of ℳ that satisfy the wf
X ⊆ H also form a model for NBG. In addition, one can verify that (Reg)
holds in this model; this is essentially just part (a) of Proposition 4.45.
A direct consequence of this fact is that, if NBG is consistent, then so is
the theory obtained by adding (Reg) as a new axiom. That (Reg) is inde-
pendent of NBG (that is, cannot be proved in NBG) can be shown by
means of a model that is somewhat more complex than the one given
above for the consistency proof (see Bernays, 1937-1954, part VII). Thus,
we can consistently add either (Reg) or its negation to NBG, if NBG is
consistent. Practically the same arguments show the independence and
consistency of (Reg) with respect to NBG + (AC).

4.87 Consider the model whose domain is H_α and whose interpretation of
∈ is E_{H_α}, the membership relation restricted to H_α. Notice that the “sets”
of this model are the sets of rank ≤_o α and the “proper classes” are the sets
of rank α′. Show that the model H_α satisfies all axioms of NBG (except
possibly the axioms of infinity and replacement) if and only if Lim(α).
Prove also that H_α satisfies the axiom of infinity if and only if α >_o ω.

4.88 Show that the axiom of infinity is not provable from the other axioms
of NBG, if the latter form a consistent theory.

4.89 Show that the replacement axiom (R) is not provable from the other axi-
oms (T, P, N, (B1)-(B7), U, W, S) if these latter form a consistent theory.

4.90 An ordinal α such that H_α is a model for NBG is called inaccessible. Since
NBG has only a finite number of proper axioms, the assertion that α is
inaccessible can be expressed by the conjunction of the relativization
to H_α of the proper axioms of NBG. Show that the existence of inacces-
sible ordinals is not provable in NBG if NBG is consistent. (Compare
Shepherdson (1951-1953), Montague and Vaught (1959), and, for
related results, Bernays (1961) and Levy (1960).) Inaccessible ordinals
have been shown to have connections with problems in measure theory
and algebra (see Ulam, 1930; Zeeman, 1955; Erdős and Tarski, 1961).*
The consistency of the theory obtained from NBG by adding an axiom
asserting the existence of an inaccessible ordinal is still an open ques-
tion. More about inaccessible ordinals may be found in Exercise 4.91.


* Inaccessible ordinals are involved also with attempts to provide a suitable set-theoretic founda-
tion for category theory (see MacLane, 1971; Gabriel, 1962; Sonner, 1962; Kruse, 1966; Isbell, 1966).


The axiom of choice turns out to be consistent and independent with
respect to the theory NBG + (Reg). More precisely, if NBG is consistent, AC
is an undecidable sentence of the theory NBG + (Reg). In fact, Gödel (1938,
1939, 1940) showed that, if NBG is consistent, then the theory NBG + (AC) +
(Reg) + (GCH) is also consistent, where (GCH) stands for the generalized con-
tinuum hypothesis:


(∀x)(Inf(x) ⇒ ¬(∃y)(x ≺ y ∧ y ≺ 𝒫(x)))


(Our statement of Gödel's result is a bit redundant, since ⊢_NBG (GCH) ⇒
(AC) has been proved by Sierpinski (1947) and Specker (1954). This result
will be proved below.) The unprovability of AC from NBG + (Reg), if NBG is
consistent, has been proved by P.J. Cohen (1963-1964), who also has shown
the independence of the special continuum hypothesis, 2^ω ≅ ω_1, in the theory
NBG + (AC) + (Reg). Expositions of the work of Cohen and its further devel-
opment can be found in Cohen (1966) and Shoenfield (1971b), as well as in
Rosser (1969) and Felgner (1971a). For a thorough treatment of these results
and other independence proofs in set theory, Jech (1978) and Kunen (1980)
should be consulted.

We shall present here a modified form of the proof in Cohen (1966) of 
Sierpinski’s theorem that GCH implies AC. 


Definition 


For any set v, let 𝒫^0(v) = v, 𝒫^1(v) = 𝒫(v), 𝒫^2(v) = 𝒫(𝒫(v)), ..., 𝒫^{k+1}(v) = 𝒫(𝒫^k(v))
for all k in ω.


Lemma 4.46 


If ω ≼ v, then 𝒫^k(v) +_c 𝒫^k(v) ≅ 𝒫^k(v) for all k ≥_o 1.


Proof 


Remember that 𝒫(x) ≅ 2^x (see Exercise 4.40). From ω ≼ v we obtain ω ≼ 𝒫^k(v)
for all k in ω. Hence, 𝒫^k(v) +_c 1 ≅ 𝒫^k(v) for all k in ω, by Exercise 4.64(g). Now,
for any k ≥_o 1,


𝒫^k(v) +_c 𝒫^k(v) ≅ 𝒫^k(v) × 2 = 𝒫(𝒫^{k−1}(v)) × 2 ≅ 2^{𝒫^{k−1}(v)} × 2
≅ 2^{𝒫^{k−1}(v)} × 2^1 ≅ 2^{𝒫^{k−1}(v) +_c 1} ≅ 2^{𝒫^{k−1}(v)} ≅ 𝒫(𝒫^{k−1}(v)) = 𝒫^k(v)


Lemma 4.47 


If y +_c x ≅ 𝒫(x +_c x), then 𝒫(x) ≼ y.

Proof


Notice that 𝒫(x +_c x) ≅ 2^{x +_c x} ≅ 2^x × 2^x ≅ 𝒫(x) × 𝒫(x). Let y* = y × {∅} and x* =
x × {1}. Since y +_c x ≅ 𝒫(x +_c x) ≅ 𝒫(x) × 𝒫(x), there is a function f such that
y* ∪ x* ≅_f 𝒫(x) × 𝒫(x). Let h be the function that takes each u in x* into the first
component of the pair f′u. Thus, h: x* → 𝒫(x). By Proposition 4.25(a), there
must exist c in 𝒫(x) − h″(x*). Then, for all z in 𝒫(x), there exists a unique v in
y* such that f′v = ⟨c, z⟩. This determines a one-one function from 𝒫(x) into y.
Hence, 𝒫(x) ≼ y.


Proposition 4.48 


Assume GCH. 


a. If u cannot be well-ordered and u +_c u ≅ u and β is an ordinal such
that β ≼ 2^u, then β ≼ u.
b. The axiom of choice holds. 


Proof 


a. Notice that u +_c u ≅ u implies 1 +_c u ≅ u, by Exercise 4.71(b). Therefore,
by Exercise 4.55(i), 2^u +_c u ≅ 2^u. Now, u ≼ β +_c u ≼ 2^u. By GCH, either
(i) u ≅ β +_c u or (ii) β +_c u ≅ 2^u. If (ii) holds, then, since u +_c u ≅ u,
β +_c u ≅ 2^{u +_c u} ≅ 𝒫(u +_c u). Hence, by Lemma 4.47, 𝒫(u) ≼ β and,
therefore, u ≼ β. Then, since u would be equinumerous with a subset of
an ordinal, u could be well-ordered, contradicting our assumption.
Hence, (i) must hold. But then, β ≼ β +_c u ≅ u.

b. We shall prove AC by proving the equivalent sentence asserting that
every set can be well-ordered (WO). To that end, consider any set
x and assume, for the sake of contradiction, that x cannot be well-
ordered. Let v = 2^{x ∪ ω}. Then ω ≼ x ∪ ω ≼ v. Hence, by Lemma 4.46,
𝒫^k(v) +_c 𝒫^k(v) ≅ 𝒫^k(v) for all k ≥_o 1, and, by the same computation
as in the proof of Lemma 4.46 applied to x ∪ ω, v +_c v ≅ v. Also, since
x ≼ x ∪ ω ≺ v ≺ 𝒫(v) ≺ 𝒫(𝒫(v)) ≺ ..., and x cannot be well-ordered,
each 𝒫^k(v) cannot be well-ordered, for k ≥_o 0. Let β = ℋ′v. We know
that β ≼ 𝒫^4(v) by Corollary 4.32. Hence, by part (a), with u = 𝒫^3(v),
we obtain β ≼ 𝒫^3(v). Using part (a) three more times (successively with
u = 𝒫^2(v), u = 𝒫(v), and u = v), we obtain ℋ′v = β ≼ v. But this
contradicts the definition of ℋ′v as the least ordinal not equinumerous
with a subset of v.


Exercise 


4.91 An α-sequence is defined to be a function f whose domain is α. If the
range of f consists of ordinals, then f is called an ordinal α-sequence and,
if, in addition, β <_o γ <_o α implies f′β <_o f′γ, then f is called an increasing
ordinal α-sequence. By Proposition 4.12, if f is an increasing ordinal
α-sequence, then ⋃(f″α) is the least upper bound of the range of f. An
ordinal δ is said to be regular if, for any increasing ordinal α-sequence
such that α <_o δ and the ordinals in the range of f are all <_o δ, ⋃(f″α) +_o 1 <_o δ.
Nonregular ordinals are called singular ordinals.

a. Which finite ordinals are regular?

b. Show that ω_0 is regular and ω_ω is singular.

c. Prove that every regular ordinal is an initial ordinal.

d. Assuming the AC, prove that every ordinal of the form ω_{γ +_o 1} is regular.

e. If ω_α is regular and Lim(α), prove that ω_α = α. (A regular ordinal ω_α
such that Lim(α) is called a weakly inaccessible ordinal.)

f. Show that, if ω_α has the property that γ <_o ω_α implies 𝒫(γ) ≺ ω_α, then
Lim(α). The converse is implied by the generalized continuum
hypothesis. A regular ordinal ω_α such that α >_o ∅ and such that
γ <_o ω_α implies 𝒫(γ) ≺ ω_α, is called strongly inaccessible. Thus, every
strongly inaccessible ordinal is weakly inaccessible and, if (GCH)
holds, the strongly inaccessible ordinals coincide with the weakly
inaccessible ordinals.

g. (i) If γ is inaccessible (i.e., if H_γ is a model of NBG), prove that γ is
weakly inaccessible. (ii)^D In the theory NBG + (AC), show that γ is
inaccessible if and only if γ is strongly inaccessible (Shepherdson,
1951-1953; Montague and Vaught, 1959).

h. If NBG is consistent, then in the theory NBG + (AC) + (GCH), show
that it is impossible to prove the existence of weakly inaccessible
ordinals.


4.6 Other Axiomatizations of Set Theory 


We have chosen to develop set theory on the basis of NBG because it is rela- 
tively simple and convenient for the practicing mathematician. There are, of 
course, many other varieties of axiomatic set theory, of which we will now 
make a brief survey. 


4.6.1 Morse-Kelley (MK)


Strengthening NBG, we can replace axioms (B1)-(B7) by the axiom schema:


(□) (∃Y)(∀x)(x ∈ Y ⇔ ℬ(x))


where ℬ(x) is any wf (not necessarily predicative) of NBG and Y is not free
in ℬ(x).

The new theory MK, called Morse-Kelley set theory, became well-known
through its appearance as an appendix in a book on general topology by
Kelley (1955). The basic idea was proposed independently by Mostowski,
Quine, and Morse (whose rather unorthodox system may be found in Morse
(1965)). Axioms (B1)-(B7) follow easily from (□) and, therefore, NBG is a
subtheory of MK. Mostowski (1951a) showed that, if NBG is consistent, then
MK is really stronger than NBG. He did this by constructing a “truth defi-
nition” in MK on the basis of which he proved ⊢_MK Con_NBG, where Con_NBG is a
standard arithmetic sentence asserting the consistency of NBG. On the other
hand, by Gödel's second theorem, Con_NBG is not provable in NBG if the latter
is consistent.

The simplicity and power of schema (□) make MK very suitable for use
by mathematicians who are not interested in the subtleties of axiomatic set
theory. But this very strength makes the consistency of MK a riskier gamble.
However, if we add to NBG + (AC) the axiom (In) asserting the existence
of a strongly inaccessible ordinal θ, then H_θ is a model of MK. Hence, MK
involves no more risk than NBG + (AC) + (In).

There are several textbooks that develop axiomatic set theory on the basis
of MK (Rubin, 1967; Monk, 1980; Chuaqui, 1981). Some of Cohen's indepen-
dence results have been extended to MK by Chuaqui (1972).


Exercises 


4.92 Prove that axioms (B1)-(B7) are theorems of MK. 


4.93 Verify that, if θ is a strongly inaccessible ordinal, then H_θ is a model
of MK.


4.6.2 Zermelo-Fraenkel (ZF)


The earliest axiom system for set theory was devised by Zermelo (1908). The
objects of the theory are thought of intuitively as sets, not the classes of NBG
or MK. Zermelo’s theory Z can be formulated in a language that contains
only one predicate letter ∈. Equality is defined extensionally: x = y stands for
(∀z)(z ∈ x ⇔ z ∈ y). The proper axioms are:


T: x = y ⇒ (x ∈ z ⇔ y ∈ z) (substitutivity of =)
P: (∃z)(∀u)(u ∈ z ⇔ u = x ∨ u = y) (pairing)
N: (∃x)(∀y)(y ∉ x) (null set)
U: (∃y)(∀u)(u ∈ y ⇔ (∃v)(u ∈ v ∧ v ∈ x)) (sum set)
W: (∃y)(∀u)(u ∈ y ⇔ u ⊆ x) (power set)
S*: (∃y)(∀u)(u ∈ y ⇔ (u ∈ x ∧ 𝒜(u))) (selection)
    where 𝒜(u) is any wf not containing y free
I: (∃x)(∅ ∈ x ∧ (∀z)(z ∈ x ⇒ z ∪ {z} ∈ x)) (infinity)


Here we have assumed the same definitions of ⊆, ∅, ∪ and {u} as in NBG.

Zermelo’s intention was to build up mathematics by starting with a few
simple sets (∅ and ω) and then constructing further sets by various well-defined
operations (such as formation of pairs, unions and power sets). In
fact, a good deal of mathematics can be built up within Z. However, Fraenkel
(1922a) observed that Z was too weak for a full development of mathematics.
For example, for each finite ordinal n, the ordinal ω +_o n can be shown to
exist, but the set A of all such ordinals cannot be proved to exist, and, therefore,
ω +_o ω, the least upper bound of A, cannot be shown to exist. Fraenkel
proposed a way of overcoming such difficulties, but his idea could not be
clearly expressed in the language of Z. However, Skolem (1923) was able to
recast Fraenkel’s idea in the following way: for any wf 𝒜(x, y), let Fun(𝒜)
stand for (∀x)(∀u)(∀v)(𝒜(x, u) ∧ 𝒜(x, v) ⇒ u = v). Thus, Fun(𝒜) asserts that 𝒜
determines a function. Skolem’s axiom schema of replacement can then be
formulated as follows:


(R*) Fun(𝒜) ⇒ (∀w)(∃z)(∀v)(v ∈ z ⇔ (∃u)(u ∈ w ∧ 𝒜(u, v)))
for any wf 𝒜(x, y)


This is the best approximation that can be found for the replacement axiom 
R of NBG. 

The system Z + (R*) is denoted ZF and is called Zermelo–Fraenkel set theory.
In recent years, ZF is often assumed to contain a set-theoretic regularity
axiom (Reg*): x ≠ ∅ ⇒ (∃y)(y ∈ x ∧ y ∩ x = ∅). The reader should always check
to see whether (Reg*) is included within ZF. ZF is now the most popular
form of axiomatic set theory; most of the modern research in set theory on 
independence and consistency proofs has been carried out with respect to 
ZF. For expositions of ZF, see Krivine (1971), Suppes (1960), Zuckerman (1974), 
Lévy (1978), and Hrbacek and Jech (1978). 

ZF and NBG yield essentially equivalent developments of set theory. Every 
sentence of ZF is an abbreviation of a sentence of NBG since, in NBG, lower- 
case variables x, y, z, ... serve as restricted set variables. Thus axiom N is an
abbreviation of (∃x)(M(x) ∧ (∀y)(M(y) ⇒ y ∉ x)) in NBG. It is a simple matter
to verify that all axioms of ZF are theorems in NBG. Indeed, NBG was
originally constructed so that this would be the case. We can conclude that, 
if NBG is consistent, then so is ZF. In fact, if a contradiction could be derived 
in ZF, the same proof would yield a contradiction in NBG. 

The presence of class variables in NBG seems to make it much more pow- 
erful than ZF. At any rate, it is possible to express propositions in NBG that 
either are impossible to formulate in ZF (such as the universal choice axiom) 
or are much more unwieldy in ZF (such as transfinite induction theorems). 
Nevertheless, it is a surprising fact that NBG is no riskier than ZF. An even 
stronger result can be proved: NBG is a conservative extension of ZF in the
sense that, for any sentence 𝒜 of the language of ZF, if ⊢_NBG 𝒜 then ⊢_ZF 𝒜 (see
Novak (Gal) 1951; Rosser and Wang, 1950; Shoenfield, 1954). This implies that,
if ZF is consistent, then NBG is consistent. Thus, NBG is consistent if and only
if ZF is consistent, and NBG seems to be no stronger than ZF. However, NBG 
and ZF do differ with respect to the existence of certain kinds of models (see 
Montague and Vaught, 1959). Moreover, another important difference is that 
NBG is finitely axiomatizable, whereas Montague (1961a) showed that ZF (as 
well as Z) is not finitely axiomatizable. Montague (1961b) proved the stronger 
result that ZF cannot be obtained by adding a finite number of axioms to Z. 


Exercise 


4.94 Let H*_α = ⋃ H_α (see page 288).
a. Verify that H*_α consists of all sets of rank less than α.
b. If α is a limit ordinal >_o ω, show that H*_α is a model for Z.
c.ᴰ Find an instance of the axiom schema of replacement (R*) that is
false in H*_(ω +_o ω). [Hint: Let ℬ(x, y) be x ∈ ω ∧ y = ω +_o x. Observe that
ω +_o ω ∉ H*_(ω +_o ω) and ω +_o ω = ⋃{v|(∃u)(u ∈ ω ∧ ℬ(u, v))}.]
d. Show that, if ZF is consistent, then ZF is a proper extension of Z.


4.6.3 The Theory of Types (ST) 


Russell’s paradox is based on the set K of all those sets that are not members
of themselves: K = {x|x ∉ x}. Clearly, K ∈ K if and only if K ∉ K. In NBG this
argument simply shows that K is a proper class, not a set. In ZF the conclusion
is just that there is no such set K.

Russell himself chose to find the source of his paradox elsewhere. He
maintained that x ∈ x and x ∉ x should be considered “illegitimate” and
“ungrammatical” formulas and, therefore, that the definition of K makes no
sense. However, this alone is not adequate because paradoxes analogous to
Russell’s can be obtained from slightly more complicated circular properties,
like x ∈ y ∧ y ∈ x.


Exercise 


4.95 a. Derive a Russell-style paradox by using x ∈ y ∧ y ∈ x.


b. Use x ∈ y₁ ∧ y₁ ∈ y₂ ∧ … ∧ yₙ₋₁ ∈ yₙ ∧ yₙ ∈ x to obtain a paradox,
where n > 1.


Thus, to avoid paradoxes, one must forbid any kind of indirect circularity. 
For this purpose, we can think of the universe divided up into types in the 
following way. Start with a collection W of nonsets or individuals. The ele- 
ments of W are said to have type 0. Sets whose members are of type 0 are 
the objects of type 1. Sets whose members are of type 1 will be the objects of
type 2, and so on.

Our language will have variables of different types. The superscript of a
variable will indicate its type. Thus, x⁰ is a variable of type 0, y¹ is a variable
of type 1, and so on. There are no variables other than type variables. The
atomic wfs are of the form xⁿ ∈ yⁿ⁺¹, where n is one of the natural numbers
0, 1, 2, .... The rest of the wfs are built up from the atomic wfs by means of
logical connectives and quantifiers. Observe that ¬(x ∈ x) and ¬(x ∈ y ∧ y ∈ x)
are not wfs.

The equality relation must be defined piecemeal, one definition for each type. 


Definition


xⁿ = yⁿ for (∀zⁿ⁺¹)(xⁿ ∈ zⁿ⁺¹ ⇔ yⁿ ∈ zⁿ⁺¹)


Notice that two objects are defined to be equal if they belong to the same 
sets of the next higher type. The basic property of equality is provided by the 
following axiom scheme. 


4.6.3.1 ST1 (Extensionality Axiom) 


(∀xⁿ)(xⁿ ∈ yⁿ⁺¹ ⇔ xⁿ ∈ zⁿ⁺¹) ⇒ yⁿ⁺¹ = zⁿ⁺¹


This asserts that two sets that have the same members must be equal. On the
other hand, observe that the property of having the same members could 
not be taken as a general definition of equality because it is not suitable for 
objects of type 0. 

Given any wf 𝒜(xⁿ), we wish to be able to define a set {xⁿ|𝒜(xⁿ)}.


4.6.3.2 ST2 (Comprehension Axiom Scheme) 


For any wf 𝒜(xⁿ), the following wf is an axiom:


(∃yⁿ⁺¹)(∀xⁿ)(xⁿ ∈ yⁿ⁺¹ ⇔ 𝒜(xⁿ))


Here, yⁿ⁺¹ is any variable not free in 𝒜(xⁿ). If we use the extensionality axiom,
then the set yⁿ⁺¹ asserted to exist by axiom ST2 is unique and can be denoted
by {xⁿ|𝒜(xⁿ)}.

Within this system, we can define the usual set-theoretic notions and 
operations, as well as the natural numbers, ordinal numbers, cardinal num- 
bers and so on. However, these concepts are not unique but are repeated 
for each type (or, in some cases, for all but the first few types). For example, 
the comprehension scheme provides a null set Λⁿ⁺¹ = {xⁿ|xⁿ ≠ xⁿ} for each
nonzero type. But there is no null set per se. The same thing happens for
natural numbers. In type theory, the natural numbers are not defined as they
are in NBG. Here they are the finite cardinal numbers. For example, the set
of natural numbers of type 2 is the intersection of all sets containing {Λ¹} and
closed under the following successor operation: the successor S(y²) of a set
y² is {v¹|(∃u¹)(∃z⁰)(u¹ ∈ y² ∧ z⁰ ∉ u¹ ∧ v¹ = u¹ ∪ {z⁰})}. Then, among the natural
numbers of type 2, we have 0 = {Λ¹}, 1 = S(0), 2 = S(1), and so on. Here, the
numerals 0, 1, 2, ... should really have a superscript “2” to indicate their type, 
but the superscripts were omitted for the sake of legibility. Note that 0 is the 
set of all sets of type 1 that contain no elements, 1 is the set of all sets of type 1 
that contain one element, 2 is the set of all sets of type 1 that contain two ele- 
ments, and so on. 
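
The finite-cardinal construction can be tried out on a small scale. The following Python sketch is only an illustration under stated assumptions: type 0 is modelled by a hypothetical three-element collection INDIVIDUALS, objects of type 1 are frozensets of individuals, and the names ZERO, successor and all_sets_of_size are ours, not part of ST.

```python
from itertools import combinations

# Hypothetical finite stand-in for the collection W of individuals (type 0).
INDIVIDUALS = ("a", "b", "c")

# The null set of type 1 and the number 0 = {null set} of type 2.
EMPTY_1 = frozenset()
ZERO = frozenset({EMPTY_1})


def successor(y2):
    """S(y2): all sets u1 ∪ {z0} with u1 in y2 and z0 an individual not in u1."""
    return frozenset(
        u1 | {z0}
        for u1 in y2
        for z0 in INDIVIDUALS
        if z0 not in u1
    )


def all_sets_of_size(k):
    """All type-1 sets (sets of individuals) with exactly k members."""
    return frozenset(frozenset(c) for c in combinations(INDIVIDUALS, k))


ONE = successor(ZERO)
TWO = successor(ONE)
THREE = successor(TWO)
FOUR = successor(THREE)

if __name__ == "__main__":
    for name, number in [("0", ZERO), ("1", ONE), ("2", TWO), ("3", THREE), ("4", FOUR)]:
        print(name, sorted(sorted(s) for s in number))
```

In this toy universe each number k is exactly the set of all k-element sets of individuals. With only three individuals, successor(THREE) is empty, so the successor operation eventually collapses; this is the finite situation that an axiom of infinity excludes.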

This repetition of the same notion in different types makes it somewhat 
inconvenient for mathematicians to work within a type theory. Moreover, 
it is easy to show that the existence of an infinite set cannot be proved from 
the extensionality and comprehension schemas.* To see this, consider the
“model” in which each variable of type n ranges over the sets of rank less
than or equal to n +_o 1. (There is nothing wrong about assigning overlapping
ranges to variables of different types.) 

We shall assume an axiom that guarantees the existence of an infinite 
set. As a preliminary, we shall adopt the usual definition {{xⁿ}, {xⁿ, yⁿ}} of the
ordered pair ⟨xⁿ, yⁿ⟩, where {xⁿ, yⁿ} stands for {uⁿ|uⁿ = xⁿ ∨ uⁿ = yⁿ}. Notice
that ⟨xⁿ, yⁿ⟩ is of type n + 2. Hence, a binary relation on a set A, being a set of
ordered pairs of elements of A, will have type 2 greater than the type of A. In
particular, a binary relation on the universe V¹ = {x⁰|x⁰ = x⁰} of all objects of
type 0 will be a set of type 3. 


4.6.3.3 ST3 (Axiom of Infinity) 


(∃x³)([(∃u⁰)(∃v⁰)(⟨u⁰, v⁰⟩ ∈ x³)] ∧
(∀u⁰)(∀v⁰)(∀w⁰)(⟨u⁰, u⁰⟩ ∉ x³ ∧ [⟨u⁰, v⁰⟩ ∈ x³ ∧ ⟨v⁰, w⁰⟩ ∈ x³ ⇒
⟨u⁰, w⁰⟩ ∈ x³] ∧ [⟨u⁰, v⁰⟩ ∈ x³ ⇒ (∃z⁰)(⟨v⁰, z⁰⟩ ∈ x³)]))


This asserts that there is a nonempty irreflexive, transitive binary relation x³
on V¹ such that every member of the range of x³ also belongs to the domain
of x³. Since no such relation exists on a finite set, V¹ must be infinite.
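
The finiteness claim can be confirmed by brute force for very small sets. The sketch below is a sanity check, not a proof, and its function names are assumptions made for this illustration. It enumerates every binary relation on {0, ..., n − 1} and looks for one that satisfies the clauses of ST3.

```python
from itertools import product


def satisfies_st3(relation, universe):
    """Check the clauses of ST3 for a relation given as a set of pairs."""
    if not relation:
        return False  # the relation must be nonempty
    domain = {u for (u, _) in relation}
    for u in universe:
        if (u, u) in relation:
            return False  # irreflexivity fails
    for (u, v) in relation:
        if v not in domain:
            return False  # a member of the range lies outside the domain
        for (v2, w) in relation:
            if v2 == v and (u, w) not in relation:
                return False  # transitivity fails
    return True


def witnesses(n):
    """Return all relations on {0, ..., n-1} that satisfy the ST3 clauses."""
    universe = range(n)
    pairs = [(u, v) for u in universe for v in universe]
    found = []
    for bits in product((False, True), repeat=len(pairs)):
        relation = {p for p, keep in zip(pairs, bits) if keep}
        if satisfies_st3(relation, universe):
            found.append(relation)
    return found
```

For n = 0, 1, 2, 3 the call witnesses(n) returns an empty list. The search examines 2^(n²) relations, so it is practical only for tiny n.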

The system based on ST1–ST3 is called the simple theory of types and is
denoted ST. Because of its somewhat complex notation and the repetition
of concepts at all (or, in some cases, almost all) type levels, ST is not generally
used as a foundation of mathematics and is not the subject of much
contemporary research. Suggestions by Turing (1948) to make type theory
more usable have been largely ignored.


* This fact seemed to undermine Russell’s doctrine of logicism, according to which all of
mathematics could be reduced to basic axioms that were of an essentially logical character. An
axiom of infinity could not be thought of as a logical truth.

With ST we can associate a first-order theory ST*. The nonlogical constants
of ST* are ∈ and monadic predicates Tₙ for each natural number n. We then
translate any wf 𝒜 of ST into ST* by replacing subformulas (∀xⁿ)ℬ(xⁿ) by
(∀x)(Tₙ(x) ⇒ ℬ(x)) and, finally, if y₁^(j₁), …, y_k^(j_k) are the free variables of 𝒜,
prefixing to the result T_(j₁)(y₁) ∧ … ∧ T_(j_k)(y_k) ⇒ and changing each
y_i^(j_i) into y_i. In a rigorous presentation, we would have to specify clearly
that the replacements are made by proceeding from smaller to larger subformulas
and that the variables x, y₁, …, y_k are new variables. The axioms of ST* are the
translations of the axioms of ST. Any theorem of ST translates into a theorem of ST*.
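
The translation just described can be sketched as a small recursive procedure. The Python code below is an illustration under stated assumptions: wfs of ST are encoded as nested tuples, a typed variable is a pair (name, type), and the new untyped variable for (name, n) is written name_n; this encoding and naming scheme are ours, chosen only so that distinct typed variables stay distinct.

```python
def untyped(var):
    """Name of the new variable replacing a typed variable (assumed scheme)."""
    name, level = var
    return f"{name}_{level}"


def free_vars(wf):
    """Set of typed variables occurring free in wf."""
    kind = wf[0]
    if kind == "in":
        return {wf[1], wf[2]}
    if kind == "not":
        return free_vars(wf[1])
    if kind in ("and", "implies"):
        return free_vars(wf[1]) | free_vars(wf[2])
    if kind == "forall":
        return free_vars(wf[2]) - {wf[1]}
    raise ValueError(f"unknown connective: {kind}")


def relativize(wf):
    """Drop the types and guard each quantifier by the predicate T_n."""
    kind = wf[0]
    if kind == "in":
        if wf[2][1] != wf[1][1] + 1:
            raise ValueError("atomic wfs must relate type n to type n + 1")
        return f"{untyped(wf[1])} ∈ {untyped(wf[2])}"
    if kind == "not":
        return f"¬({relativize(wf[1])})"
    if kind == "and":
        return f"({relativize(wf[1])} ∧ {relativize(wf[2])})"
    if kind == "implies":
        return f"({relativize(wf[1])} ⇒ {relativize(wf[2])})"
    if kind == "forall":
        v = untyped(wf[1])
        return f"(∀{v})(T_{wf[1][1]}({v}) ⇒ {relativize(wf[2])})"
    raise ValueError(f"unknown connective: {kind}")


def translate(wf):
    """Translate a wf of ST into the first-order language with predicates T_n."""
    body = relativize(wf)
    free = sorted(free_vars(wf), key=lambda v: (v[1], v[0]))
    if not free:
        return body
    guard = " ∧ ".join(f"T_{level}({untyped((name, level))})" for name, level in free)
    return f"{guard} ⇒ {body}"


if __name__ == "__main__":
    x0, y1, z1 = ("x", 0), ("y", 1), ("z", 1)
    print(translate(("forall", x0, ("implies", ("in", x0, y1), ("in", x0, z1)))))
```

The example in the main block translates the wf saying that every member of y¹ is a member of z¹; the free variables y¹ and z¹ give rise to the prefixed hypotheses on their types.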


Exercise 


4.96 Exhibit a model of ST* within NBG. 


By virtue of Exercise 4.96, NBG (or ZF) is stronger than ST: (1) any theorem
of ST can be translated into a corresponding theorem of NBG, and (2) if NBG
is consistent, so is ST.*

To provide a type theory that is easier to work with, one can add axi- 
oms that impose additional structure on the set V! of objects of type 0. For 
example, Peano’s axioms for the natural numbers were adopted at level 0 in 
Gödel’s system P, for which he originally proved his famous incompleteness
theorem (see Gödel, 1931).

In Principia Mathematica (1910-1913), the three-volume work by Alfred 
North Whitehead and Bertrand Russell, there is a theory of types that is 
further complicated by an additional hierarchy of orders. This hierarchy was 
introduced so that the comprehension scheme could be suitably restricted 
in order not to generate an impredicatively defined set, that is, a set A defined 
by a formula in which some quantified variable ranges over a set that turns 
out to contain the set A itself. Along with the mathematician Henri Poincaré, 
Whitehead and Russell believed impredicatively defined sets to be the root 
of all evil. However, such concepts are required in analysis (e.g., in the proof
that any nonempty set of real numbers that is bounded above has a least 
upper bound). Principia Mathematica had to add the so-called axiom of reduc- 
ibility to overcome the order restrictions imposed on the comprehension 
scheme. The Whitehead—Russell system without the axiom of reducibility is 
called ramified type theory; it is mathematically weak but is of interest to those 
who wish an extreme constructivist approach to mathematics. The axiom 
of reducibility vitiates the effect of the order hierarchy; therefore, it is much 
simpler to drop the notion of order and the axiom of reducibility. The result 
is the simple theory of types ST, which we have described above. 


* A stronger result was proved by John Kemeny (1949) by means of a truth definition within Z:
if Z is consistent, so is ST.

In ST, the types are natural numbers. For a smoother presentation, some
logicians allow a larger set of types, including types for relations and/or 
functions defined on objects taken from previously defined types. Such a 
system may be found in Church (1940). 

Principia Mathematica must be read critically; for example, it often overlooks 
the distinction between a formal theory and its metalanguage. The idea of 
a simple theory of types goes back to Ramsey (1925) and, independently, 
to Chwistek (1924-1925). Discussions of type theory are found in Andrews 
(1986), Hatcher (1982) and Quine (1963). 


4.6.4 Quine’s Theories NF and ML 


Quine (1937) invented a type theory that was designed to do away with some 
of the unpleasant aspects of type theory while keeping the essential idea of the 
comprehension axiom ST2. Quine’s theory NF (New Foundations) uses only one 
kind of variable x, y, z, ... and one binary predicate letter ∈. Equality is defined
as in type theory: x = y stands for (∀z)(x ∈ z ⇔ y ∈ z). The first axiom is familiar:


4.6.4.1 NF1 (Extensionality) 
(∀z)(z ∈ x ⇔ z ∈ y) ⇒ x = y


In order to formulate the comprehension axiom, we introduce the notion 
of stratification. A wf 𝒜 is said to be stratified if one can assign integers to
the variables of 𝒜 so that: (1) all occurrences of the same free variable are
assigned the same integer, (2) all bound occurrences of a variable that are
bound by the same quantifier must be assigned the same integer, and (3) for
every subformula x ∈ y of 𝒜, the integer assigned to y is 1 greater than the
integer assigned to x.


Examples 


1. (∃y)(x ∈ y ∧ y ∈ z) ∨ u ∈ x is stratified by virtue of the assignment
indicated below by superscripts:


(∃y²)(x¹ ∈ y² ∧ y² ∈ z³) ∨ u⁰ ∈ x¹

2. ((∃y)(x ∈ y)) ∧ (∃y)(y ∈ x) is stratified as follows:


((∃y²)(x¹ ∈ y²)) ∧ (∃y⁰)(y⁰ ∈ x¹)


Notice that the ys in the second conjunct do not have to have the
same integers assigned to them as the ys in the first conjunct.

3. x ∈ y ∨ y ∈ x is not stratified. If x is assigned an integer n, then the
first y must be assigned n + 1 and the second y must be assigned
n − 1, contradicting requirement (1).
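
Stratification is a decidable, purely syntactic condition, and the three examples can be checked mechanically. The following Python sketch is one possible test, under assumptions of our own: wfs are encoded as nested tuples, bound variables are renamed apart (one fresh token per quantifier), and the integers are found by propagating the constraint "level of y = level of x + 1" across each atomic subformula x ∈ y.

```python
from itertools import count


def membership_pairs(wf):
    """List the pairs (x, y) for atomic subformulas x ∈ y, bound variables renamed apart."""
    fresh = count()
    pairs = []

    def walk(f, env):
        kind = f[0]
        if kind == "in":
            pairs.append((env.get(f[1], f[1]), env.get(f[2], f[2])))
        elif kind == "not":
            walk(f[1], env)
        elif kind in ("and", "or", "implies"):
            walk(f[1], env)
            walk(f[2], env)
        elif kind in ("forall", "exists"):
            # Each quantifier gets its own token, as allowed by requirement (2).
            walk(f[2], {**env, f[1]: (f[1], next(fresh))})
        else:
            raise ValueError(f"unknown connective: {kind}")

    walk(wf, {})
    return pairs


def stratify(wf):
    """Return an integer assignment witnessing stratification, or None if there is none."""
    graph = {}
    for x, y in membership_pairs(wf):
        graph.setdefault(x, []).append((y, 1))
        graph.setdefault(y, []).append((x, -1))
    level = {}
    for start in graph:
        if start in level:
            continue
        level[start] = 0
        stack = [start]
        while stack:
            v = stack.pop()
            for w, delta in graph[v]:
                if w not in level:
                    level[w] = level[v] + delta
                    stack.append(w)
                elif level[w] != level[v] + delta:
                    return None  # two incompatible integers for one variable
    return level


EXAMPLE_1 = ("or", ("exists", "y", ("and", ("in", "x", "y"), ("in", "y", "z"))), ("in", "u", "x"))
EXAMPLE_2 = ("and", ("exists", "y", ("in", "x", "y")), ("exists", "y", ("in", "y", "x")))
EXAMPLE_3 = ("or", ("in", "x", "y"), ("in", "y", "x"))
```

Here stratify returns a dictionary of integers for Examples 1 and 2 and None for Example 3. The integers are determined only up to a constant shift on each connected group of variables, so they need not coincide with the superscripts shown above.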


4.6.4.2 NF2 (Comprehension) 


For any stratified wf 𝒜(x),


(∃y)(∀x)(x ∈ y ⇔ 𝒜(x))


is an axiom. (Here, y is assumed to be the first variable not free in 𝒜(x).)
Although NF2 is an axiom scheme, it turns out that NF is finitely axiomatizable
(Hailperin, 1944).


Exercise 


4.97 Prove that equality could have been defined as follows: x = y for
(∀z)(x ∈ z ⇒ y ∈ z). (More precisely, in the presence of NF2, this definition is
equivalent to the original one.)


The theory of natural numbers, ordinal numbers and cardinal numbers 
is developed in much the same way as in type theory, except that there is 
no longer a multiplicity of similar concepts. There is a unique empty set Λ =
{x|x ≠ x} and a unique universal set V = {x|x = x}. We can easily prove V ∈ V,
which immediately distinguishes NF from type theory (and from NBG, MK 
and ZF). 

The usual argument for Russell’s paradox does not hold in NF, since 
x € x is not stratified. Almost all of standard set theory and mathematics 
is derivable in NF; this is done in full detail in Rosser (1953). However, NF 
has some very strange properties. First of all, the usual proof of Cantor’s 
theorem, A ≺ 𝒫(A), does not go through in NF; at a key step in the proof,
a set that is needed is not available because its defining condition is not
stratified. The apparent unavailability of Cantor’s theorem has the desirable
effect of undermining the usual proof of Cantor’s paradox. If we could
prove A ≺ 𝒫(A), then, since 𝒫(V) = V, we could obtain a contradiction
from V ≺ 𝒫(V). In NF, the standard proof of Cantor’s theorem does yield
USC(A) ≺ 𝒫(A), where USC(A) stands for {x|(∃u)(u ∈ A ∧ x = {u})}. If we let
A = V, we conclude that USC(V) ≺ V. Thus, V has the peculiar property that
it is not equinumerous with the set of all unit sets of its elements. In NBG,
the function f, defined by f(u) = {u} for all u in A, establishes a one-one
correspondence between A and USC(A) for any set A. However, the defining
condition for f is not stratified, so that f may not exist in NF. If f does exist,
A is said to be strongly Cantorian.

Other surprising properties of NF are the following.


1. The axiom of choice is disprovable in NF (Specker, 1953). 


2. Any model for NF must be nonstandard in the sense that a well- 
ordering of the finite cardinals or of the ordinals of the model is not 
possible in the metalanguage (Rosser and Wang, 1950). 


3. The axiom of infinity is provable in NF (Specker, 1953). 


Although property 3 would ordinarily be thought of as a great advantage, 
the fact of the provability of an axiom of infinity appeared to many logicians 
to be too strong a result. If that can be proved, then probably anything can be 
proved, that is, NF is likely to be inconsistent. In addition, the disprovability 
of the axiom of choice seems to make NF a poor choice for practicing math- 
ematicians. However, if we restrict attention to so-called Cantorian sets, sets 
A for which A and USC(A) are equinumerous, then it might be consistent to 
assume the axiom of choice for Cantorian sets and to do mathematics within 
the universe of Cantorian sets. 

NF has another attractive feature. A substantial part of category theory (see 
MacLane, 1971) can be developed in a straightforward way in NF, whereas
this is not possible in ZF, NBG, or MK. Since category theory has become an 
important branch of mathematics, this is a distinct advantage for NF. 

If the system obtained from NF by assuming the existence of an inac- 
cessible ordinal is consistent, then ZF is consistent (see Collins, 1955; Orey, 
1956a). If we add to NF the assumption of the existence of an infinite strongly 
Cantorian set, then Zermelo’s set theory Z is consistent (see Rosser, 1954). The 
question of whether the consistency of ZF implies the consistency of NF is 
still open (as is the question of the reverse implication). 

Let ST⁻ be the simple theory of types ST without the axiom of infinity.
Given any closed wf 𝒜 of ST, let 𝒜⁺ denote the result of adding 1 to the types
of all variables in 𝒜. Let SP denote the theory obtained from ST by adding as
axioms the wfs 𝒜 ⇔ 𝒜⁺ for all closed wfs 𝒜. Specker (1958, 1962) proved that
NF is consistent if and only if SP is consistent.

Let NFU denote the theory obtained from NF by restricting the extension- 
ality axiom to nonempty sets: 


NF1* (∃u)(u ∈ x) ∧ (∀z)(z ∈ x ⇔ z ∈ y) ⇒ x = y


Jensen (1968-1969) proved that NFU is consistent if and only if ST⁻ is consistent,
and the equiconsistency continues to hold when both theories are
supplemented by the axiom of infinity or by axioms of infinity and choice.
Discussions of NF may be found in Hatcher (1982) and Quine (1963). Forster
(1983) gives a survey of more recent results.

Quine also proposed a system ML that is formally related to NF in much
the same way that MK is related to ZF. The variables are capital italic letters
X, Y, Z, ...; these variables are called class variables. We define M(X), X is a
set,* by (∃Y)(X ∈ Y), and we introduce lower-case italic letters x, y, z, ... as
variables restricted to sets. Equality is defined as in NBG: X = Y for (∀Z)(Z ∈
X ⇔ Z ∈ Y). Then we introduce an axiom of equality:

ML1: X = Y ∧ X ∈ Z ⇒ Y ∈ Z


There is an unrestricted comprehension axiom scheme: 


ML2: (∃Y)(∀x)(x ∈ Y ⇔ 𝒜(x))


where 𝒜(x) is any wf of ML. Finally, we wish to introduce an axiom that has
the same effect as the comprehension axiom scheme NF2:


ML3: (∀y₁)…(∀yₙ)(∃z)(∀x)(x ∈ z ⇔ 𝒜(x))


where 𝒜(x) is any stratified wf whose free variables are x, y₁, …, yₙ (n ≥ 0) and
whose quantifiers are set quantifiers.

All theorems of NF are provable in ML. Hence, if ML is consistent, so is NF. 
The converse has been proved by Wang (1950). In fact, any closed wf of NF 
provable in ML is already provable in NF. 

ML has the same advantages over NF that MK and NBG have over ZF: a 
greater ease and power of expression. Moreover, the natural numbers of ML 
behave much better than those of NF; the principle of mathematical induc- 
tion can be proved in full generality in ML. 

The prime source for ML is Quine (1951). Consult also Quine (1963) and 
Fraenkel et al. (1973). 


4.6.5 Set Theory with Urelements 


The theories NBG, MK, ZF, NF, and ML do not allow for objects that are 
not sets or classes. This is all well and good for mathematicians, since only 
sets or classes seem to be needed for dealing with mathematical concepts 
and problems. However, if set theory is to be a part of a more inclusive 
theory having to do with the natural or social sciences, we must permit 
reference to things like electrons, molecules, people, companies, etc., and to 
sets and classes that contain such things. Things that are not sets or classes 
are sometimes called urelements.‡ We shall sketch a theory UR similar to


* Quine uses the word “element” instead of “set.”

† Quine’s earlier version of ML, published in 1940, was proved inconsistent by Rosser (1942).
The present version is due to Wang (1950).

‡ “Ur” is a German prefix meaning primitive, original or earliest. The words “individual” and
“atom” are sometimes used as synonyms for “urelement.” 
NBG that allows for the existence of urelements.* Like NBG, UR will have a
finite number of axioms. 

The variables of UR will be the lower-case Latin boldface letters x₁, x₂, ... .
(As usual, let us use x, y, z, ... to refer to arbitrary variables.) In addition to
the binary predicate letter A₂² there will be a monadic predicate letter A₁¹. We
abbreviate A₂²(x, y) by x ∈ y, ¬A₂²(x, y) by x ∉ y, and A₁¹(x) by Cls(x). (Read
“Cls(x)” as “x is a class.”) To bring our notation into line with that of NBG,
we shall use capital Latin letters as restricted variables for classes. Thus,
(∀X)𝒜(X) stands for (∀x)(Cls(x) ⇒ 𝒜(x)), and (∃X)𝒜(X) stands for (∃x)(Cls(x) ∧
𝒜(x)). Let M(x) stand for Cls(x) ∧ (∃y)(x ∈ y), and read “M(x)” as “x is a set.”
As in NBG, use lower-case Latin letters as restricted variables for sets. Thus,
(∀x)𝒜(x) stands for (∀x)(M(x) ⇒ 𝒜(x)), and (∃x)𝒜(x) stands for (∃x)(M(x) ∧ 𝒜(x)),
where the x on each right-hand side is an unrestricted (boldface) variable.
Let Pr(x) stand for Cls(x) ∧ ¬M(x), and read “Pr(x)” as “x is a proper class.”
Introduce Ur(x) as an abbreviation for ¬Cls(x), and read “Ur(x)” as “x is an
urelement.” Thus, the domain of any model for UR will be divided into two
disjoint parts consisting of the classes and the urelements, and the classes
are divided into sets and proper classes. Let El(x) stand for M(x) ∨ Ur(x), and
read “El(x)” as “x is an element.” In our intended interpretation, sets and
urelements are the objects that are elements (i.e., members) of classes.


Exercise 
4.98 Prove: ⊢_UR (∀x)(El(x) ⇔ ¬Pr(x)).


We shall define equality in a different way for classes and urelements. 
Definition 
x = y is an abbreviation for: 


[Cls(x) ∧ Cls(y) ∧ (∀z)(z ∈ x ⇔ z ∈ y)] ∨ [Ur(x) ∧ Ur(y) ∧ (∀z)(x ∈ z ⇔ y ∈ z)]


Exercise 
4.99 Prove: ⊢_UR (∀x)(x = x).


Axiom UR1


(∀x)[Ur(x) ⇒ (∀y)(y ∉ x)]


Thus, urelements have no members. 


* Zermelo’s 1908 axiomatization permitted urelements. Fraenkel was among the first to draw 
attention to the fact that urelements are not necessary for mathematical purposes (see 
Fraenkel, 1928, pp. 355f). Von Neumann’s (1925, 1928) axiom systems excluded urelements.


Exercise


4.100 Prove: ⊢_UR (∀x)(∀y)(x ∈ y ⇒ Cls(y) ∧ El(x)).


Axiom UR2


(∀X)(∀Y)(∀Z)(X = Y ∧ X ∈ Z ⇒ Y ∈ Z)


Exercise 


4.101 Show: 

a. ⊢_UR (∀x)(∀y)(x = y ⇒ (∀z)(z ∈ x ⇔ z ∈ y))

b. ⊢_UR (∀x)(∀y)(x = y ⇒ (∀z)(x ∈ z ⇔ y ∈ z))

c. ⊢_UR (∀x)(∀y)(x = y ⇒ [Cls(x) ⇔ Cls(y)] ∧ [Ur(x) ⇔ Ur(y)] ∧ [M(x) ⇔ M(y)])

d. ⊢_UR (∀x)(∀y)[x = y ⇒ (𝒜(x, x) ⇒ 𝒜(x, y))], where 𝒜(x, y) arises from
𝒜(x, x) by replacing some, but not necessarily all, free occurrences
of x by y, with the proviso that y is free for x in 𝒜(x, x).

e. UR is a first-order theory with equality (with respect to the given 
definition of equality). 


Axiom UR3 (Null Set) 


(∃x)(∀y)(y ∉ x)


This tells us that there is a set that has no members. Of course, all urelements
also have no elements. 


Exercise 


4.102 Show: ⊢_UR (∃₁x)(∀y)(y ∉ x). On the basis of this exercise we can introduce
a new individual constant ∅ satisfying the condition M(∅) ∧ (∀y)(y ∉ ∅).


Axiom UR4 (Pairing)


(∀x)(∀y)(El(x) ∧ El(y) ⇒ (∃z)(∀u)(u ∈ z ⇔ [u = x ∨ u = y]))


Exercise 


4.103 Prove: ⊢_UR (∀x)(∀y)(∃₁z)([El(x) ∧ El(y) ∧ (∀u)(u ∈ z ⇔ [u = x ∨ u = y])] ∨
[(¬El(x) ∨ ¬El(y)) ∧ z = ∅]). On the basis of this exercise we can introduce
the unordered pair notation {x, y}. When x and y are elements,
{x, y} is the set that has x and y as its only members; when x or y is
a proper class, {x, y} is arbitrarily chosen to be the empty set ∅. As
usual, the singleton notation {x} stands for {x, x}.


Definition (Ordered Pair)


Let ⟨x, y⟩ stand for {{x}, {x, y}}. As in the proof of Proposition 4.3, one can
show that, for any elements x, y, u, v, ⟨x, y⟩ = ⟨u, v⟩ ⇔ [x = u ∧ y = v]. Ordered
n-tuples can be defined as in NBG.
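
The characteristic property of the ordered pair can be checked on a small stock of elements. The Python sketch below is an illustration only: strings stand in for urelements and frozensets for sets, and the names pair and check_pairs are assumptions made for this example.

```python
from itertools import product


def pair(x, y):
    """Ordered pair of x and y, encoded as the set {{x}, {x, y}}."""
    return frozenset({frozenset({x}), frozenset({x, y})})


def check_pairs(elements):
    """Verify that pair(x, y) == pair(u, v) exactly when x == u and y == v."""
    for x, y, u, v in product(elements, repeat=4):
        if (pair(x, y) == pair(u, v)) != (x == u and y == v):
            return False
    return True


# Two hypothetical urelements and two sets built from them.
SAMPLE = ["a", "b", frozenset(), frozenset({"a"})]

if __name__ == "__main__":
    print(check_pairs(SAMPLE))
```

Note that pair("a", "a") collapses to the one-membered set whose only member is {"a"}, because {x, x} is the singleton {x}; the check confirms that this collapse does not spoil the characteristic property.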

The class existence axioms B1-B7 of NBG have to be altered slightly by 
sometimes replacing universal quantification with respect to sets by univer- 
sal quantification with respect to elements. 


Axioms of Class Existence 


(UR5) (∃X)(∀u)(∀v)(El(u) ∧ El(v) ⇒ [⟨u, v⟩ ∈ X ⇔ u ∈ v])
(UR6) (∀X)(∀Y)(∃Z)(∀u)(u ∈ Z ⇔ u ∈ X ∧ u ∈ Y)
(UR7) (∀X)(∃Z)(∀u)(El(u) ⇒ [u ∈ Z ⇔ u ∉ X])


(UR8) (∀X)(∃Z)(∀u)(El(u) ⇒ [u ∈ Z ⇔ (∃v)(⟨u, v⟩ ∈ X)])

(UR9) (∀X)(∃Z)(∀u)(∀v)(El(u) ∧ El(v) ⇒ [⟨u, v⟩ ∈ Z ⇔ u ∈ X])

(UR10) (∀X)(∃Z)(∀u)(∀v)(∀w)(El(u) ∧ El(v) ∧ El(w) ⇒ [⟨u, v, w⟩ ∈ Z ⇔
⟨v, w, u⟩ ∈ X])

(UR11) (∀X)(∃Z)(∀u)(∀v)(∀w)(El(u) ∧ El(v) ∧ El(w) ⇒ [⟨u, v, w⟩ ∈ Z ⇔
⟨u, w, v⟩ ∈ X])


As in NBG, we can prove the existence of the intersection, complement and
union of any classes, and the existence of the class V of all elements. But in
UR we also need an axiom to ensure the existence of the class V_M of all sets,
or, equivalently, of the class V_ur of all urelements.


Axiom UR12


(∃X)(∀u)(u ∈ X ⇔ Ur(u))


This yields the existence of V_ur and implies the existence of V_M, that is,
(∃X)(∀u)(u ∈ X ⇔ M(u)). The class V_el of all elements is then the union V_ur ∪ V_M.
Note that this axiom also yields (∃X)(∀u)(El(u) ⇒ [u ∈ X ⇔ Cls(u)]), since V_M
can be taken as the required class X.

As in NBG, we can prove a general class existence theorem. 


Exercise 


4.104 Let φ(x₁, …, xₙ, y₁, …, yₘ) be a formula in which quantification takes
place only with respect to elements, that is, any subformula (∀u)ℬ has
the form (∀u)(El(u) ⇒ 𝒞). Then


⊢_UR (∃Z)(∀x₁)…(∀xₙ)(El(x₁) ∧ … ∧ El(xₙ) ⇒
[⟨x₁, …, xₙ⟩ ∈ Z ⇔ φ(x₁, …, xₙ, y₁, …, yₘ)]).


The sum set, power set, replacement and infinity axioms can be translated
into UR.


Axiom UR13 


(∀x)(∃y)(∀u)(u ∈ y ⇔ (∃v)(u ∈ v ∧ v ∈ x))


Axiom UR14 


(∀x)(∃y)(∀u)(u ∈ y ⇔ u ⊆ x)


where u ⊆ x stands for M(u) ∧ M(x) ∧ (∀v)(v ∈ u ⇒ v ∈ x).


Axiom UR15


(∀Y)(∀x)(Un(Y) ⇒ (∃y)(∀u)[u ∈ y ⇔ (∃v)(⟨v, u⟩ ∈ Y ∧ v ∈ x)])


where Un(z) stands for
(∀x₁)(∀x₂)(∀x₃)[El(x₁) ∧ El(x₂) ∧ El(x₃) ⇒ (⟨x₁, x₂⟩ ∈ z ∧ ⟨x₁, x₃⟩ ∈ z ⇒ x₂ = x₃)]


Axiom UR16 


(∃x)(∅ ∈ x ∧ (∀u)(u ∈ x ⇒ u ∪ {u} ∈ x))


From this point on, the standard development of set theory including the 
theory of ordinal numbers, can be imitated in UR. 


Proposition 4.49 


NBG is a subtheory of UR. 


Proof 


It is easy to verify that every axiom of NBG is provable in UR, provided that 
we take the variables of NBG as restricted variables for “classes” in UR. The 
restricted variables for sets in NBG become restricted variables for “sets” 
in UR.*


* In fact, a formula (∀x)𝒜(x) in NBG is an abbreviation in NBG for (∀X)((∃Y)(X ∈ Y) ⇒ 𝒜(X)). The
latter formula is equivalent in UR to (∀x)(M(x) ⇒ 𝒜(x)), which is abbreviated as (∀x)𝒜(x) in UR.


Proposition 4.50
UR is consistent if and only if NBG is consistent. 


Proof 


By Proposition 4.49, if UR is consistent, NBG is consistent. For the converse, 
note that any model of NBG yields a model of UR in which there are no ure- 
lements. In fact, if we replace “Cls(x)” by the NBG formula “x = x,” then the 
axioms of UR become theorems of NBG. Hence, a proof of a contradiction in 
UR would produce a proof of a contradiction in NBG. 

The axiom of regularity (Reg) takes the following form in UR. 


(Reg_UR) (∀X)(X ≠ ∅ ⇒ (∃u)(u ∈ X ∧ ¬(∃v)(v ∈ X ∧ v ∈ u)))


It is clear that an analogue of Proposition 4.49 holds: UR + (Reg_UR) is an
extension of NBG + (Reg). Likewise, the argument of Proposition 4.50 shows
the equiconsistency of NBG + (Reg) and UR + (Reg_UR).

Since definition by transfinite induction (Proposition 4.14(b)) holds in UR, 
the cumulative hierarchy can be defined 


Ψ'∅ = ∅
Ψ'(α') = 𝒫(Ψ'α)
Lim(λ) ⇒ Ψ'λ = ⋃_(β <_o λ) Ψ'β


and the union H = ⋃(Ψ″On) is the class of “pure” sets in UR and forms a
model of NBG + (Reg). In NBG, by Proposition 4.45, (Reg) is equivalent to
V = H, where V is the class of all sets.
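
For finite ordinals the first levels of the cumulative hierarchy can be computed explicitly. The Python sketch below is an illustration, with function names of our own choosing; it covers only the base and successor clauses, since the limit clause has no finite counterpart.

```python
from itertools import combinations


def powerset(s):
    """The set of all subsets of the finite set s."""
    items = list(s)
    return frozenset(
        frozenset(c)
        for r in range(len(items) + 1)
        for c in combinations(items, r)
    )


def psi(n):
    """Level n of the hierarchy for a finite ordinal n: start from the empty set, apply powerset n times."""
    level = frozenset()
    for _ in range(n):
        level = powerset(level)
    return level


if __name__ == "__main__":
    print([len(psi(n)) for n in range(5)])
```

The levels have 0, 1, 2, 4 and 16 members, and each level is included in the next. Level 5 already has 65536 members, so the computation is only feasible for the first few levels.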

If the class V_ur of urelements is a set, then we can define the following by
transfinite induction:


The union H_ur = ⋃(Ξ″On) is a model of UR + (Reg_UR), and (Reg_UR) holds in UR
if and only if H_ur is the class V_el of all elements.

If the class V_ur of urelements is a proper class, it is possible to obtain an
analogue of H_ur in the following way. For any set x whose members are
urelements and any ordinal γ, we can define a function Ξ_γ^x by transfinite
induction up to γ:


Ξ_γ^x'∅ = x
Ξ_γ^x'(α') = 𝒫(Ξ_γ^x'α) if α' <_o γ
Lim(λ) ⇒ Ξ_γ^x'λ = ⋃_(β <_o λ) Ξ_γ^x'β if λ <_o γ


Let H*_ur be the class of all elements v such that, for some x and γ, v is in
the range of Ξ_γ^x. Then H*_ur determines a model of UR + (Reg_UR), and, in UR,
(Reg_UR) holds if and only if H*_ur is the class V_el of all elements.

The equiconsistency of NBG and UR can be strengthened to show the fol- 
lowing result. 


Proposition 4.51 
If NBG is consistent, then so is the theory UR + $(\mathrm{Reg}_{UR})$ + “$V_{ur}$ is denumerable.”


Proof 


Within NBG one can define a model with domain $\omega$ that is a model of NBG
without the axiom of infinity. The idea is due to Ackermann (1937). For any
$n$ and $m$ in $\omega$, define $m \mathrel{\tilde{\in}} n$ to mean that $2^m$ occurs as a term in the expansion
of $n$ as a sum of different powers of 2.* If we take “A-sets” to be members
of $\omega$ and “proper A-classes” to be infinite subsets of $\omega$, it is easy to verify
all axioms of NBG + (Reg) except the axiom of infinity. (See Bernays 1954,
pp. 81-82 for a sketch of the argument.) Then we change the “membership”
relation on $\omega$ by defining $m \in_1 n$ to mean that $2^m \mathrel{\tilde{\in}} n$. Now we define a “set”
to be either 0 or a member $n$ of $\omega$ for which there is some $m$ in $\omega$ such that
$m \in_1 n$. We take the “urelements” to be the members of $\omega$ that are not “sets.”
For example, 8 is an “urelement,” since $8 = 2^3$ and 3 is not a power of 2.
Other small “urelements” are 1, 9, 32, 33, and 40. In general, the “urelements”
are sums of one or more distinct powers $2^k$, where $k$ is not a power
of 2. The “proper classes” are to be the infinite subsets of $\omega$. Essentially the
same argument as for Ackermann’s model shows that this yields a model
$\mathcal{M}$ of all axioms of UR + $(\mathrm{Reg}_{UR})$ except the axiom of infinity. Now we want
to extend $\mathcal{M}$ to a model of UR. First, let $r$ stand for the set of all finite subsets


* This is equivalent to the statement that the greatest integer $k$ such that $k \cdot 2^m \le n$ is odd.
† For distinct natural numbers $n_1, \ldots, n_k$, the role of the finite set $\{n_1, \ldots, n_k\}$ is played by the
natural number $2^{n_1} + \cdots + 2^{n_k}$.

of $\omega$ that are not members of $\omega$, and then define by transfinite induction the
following function $\Theta$.

\[
\begin{aligned}
\Theta'\emptyset &= \omega \\
\Theta'(\alpha') &= \mathcal{P}(\Theta'\alpha) - r \\
\mathrm{Lim}(\lambda) \Rightarrow \Theta'\lambda &= \bigcup_{\beta <_o \lambda} \Theta'\beta
\end{aligned}
\]

Let $H_B = \bigcup(\Theta''\mathrm{On})$. Note that $H_B$ contains no members of $r$. Let us define a
membership relation $\in^*$ on $H_B$. For any members $x$ and $y$ of $H_B$, define $x \in^* y$
to mean that either $x$ and $y$ are in $\omega$ and $x \in_1 y$, or $y \notin \omega$ and $x \in y$. The “urelements”
will be those members of $\omega$ that are the “urelements” of $\mathcal{M}$. The “sets”
will be the ordinary sets of $H_B$ that are not “urelements,” and the “proper
classes” will be the proper classes of NBG that are subclasses of $H_B$. It now
requires a long careful argument to show that we have a model of UR +
$(\mathrm{Reg}_{UR})$ in which the class of urelements is a denumerable set.
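
The arithmetic behind the two “membership” relations on $\omega$ can be checked by machine. The following is a minimal Python sketch, not part of the proof; the function names `ackermann_members`, `members_1`, and `is_urelement` are hypothetical labels introduced only for this illustration.

```python
def ackermann_members(n: int) -> list[int]:
    """Exponents m such that 2**m occurs in the expansion of n as a sum of distinct powers of 2."""
    return [m for m in range(n.bit_length()) if (n >> m) & 1]


def members_1(n: int) -> list[int]:
    """Numbers m such that 2**m is an Ackermann member of n (the modified relation)."""
    exponents = ackermann_members(n)
    return [m for m in range(n.bit_length()) if (1 << m) in exponents]


def is_urelement(n: int) -> bool:
    """n counts as an 'urelement' when it is not 0 and has no members under the modified relation."""
    return n != 0 and not members_1(n)


# The "urelements" not exceeding 40.
small_urelements = [n for n in range(41) if is_urelement(n)]
print(small_urelements)
```

The list printed for the range 0 to 40 agrees with the small “urelements” named in the proof.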

A uniform method for constructing a model of UR + $(\mathrm{Reg}_{UR})$ in which the
class of urelements is a set of arbitrary size may be found in Brunner (1990,
p. 65).* If AC holds in the underlying theory, it holds in the model as well.

The most important application of axiomatic set theories with urelements 
used to be the construction of independence proofs. The first independence 
proof for the axiom of choice, given by Fraenkel (1922b), depended essen- 
tially on the existence of a denumerable set of urelements. More precise 
formulations and further developments may be found in Lindenbaum and 
Mostowski (1938) and Mostowski (1939). Translations of these proofs into 
set theories without urelements were found by Shoenfield (1955), Mendelson 
(1956b) and Specker (1957), but only at the expense of weakening the axiom of 
regularity. This shortcoming was overcome by the forcing method of Cohen 
(1966), which applies to theories with (Reg) and without urelements. 


* Brunner attributes the idea behind the construction to J. Truss. 
† For more information about these methods, see Levy (1965), Pincus (1972), Howard (1973),
and Brunner (1990). 


5


Computability 


5.1 Algorithms: Turing Machines 


An algorithm is a computational method for solving each and every problem 
from a large class of problems. The computation has to be precisely specified 
so that it requires no ingenuity for its performance. The familiar technique 
for adding integers is an algorithm, as are the techniques for computing the 
other arithmetic operations of subtraction, multiplication and division. The 
truth table procedure to determine whether a statement form is a tautology 
is an algorithm within logic itself. 

It is often easy to see that a specified procedure yields a desired algorithm. 
In recent years, however, many classes of problems have been proved not to 
have an algorithmic solution. Examples are: 


1. Is a given wf of quantification theory logically valid?

2. Is a given wf of formal number theory S true (in the standard
interpretation)?

3. Is a given wf of S provable in S?

4. Does a given polynomial $f(x_1, \ldots, x_n)$ with integral coefficients have
integral roots (Hilbert’s 10th problem)?


In order to prove rigorously that there does not exist an algorithm for answer- 
ing such questions, it is necessary to supply a precise definition of the notion 
of algorithm. 

Various proposals for such a definition were independently offered in 1936 
by Church (1936b), Turing (1936-1937), and Post (1936). All of these defini- 
tions, as well as others proposed later, have been shown to be equivalent. 
Moreover, it is intuitively clear that every procedure given by these defini- 
tions is an algorithm. On the other hand, every known algorithm falls under 
these definitions. Our exposition will use Turing’s ideas. 

First of all, the objects with which an algorithm deals may be assumed to
be the symbols of a finite alphabet $A = \{a_0, a_1, \ldots, a_n\}$. Nonsymbolic objects can
be represented by symbols, and languages actually used for computation
require only finitely many symbols.*

FIGURE 5.1

A finite sequence of symbols of a language $A$ is called a word of $A$. It is convenient
to admit an empty word $\Lambda$ consisting of no symbols at all. If $P$ and $Q$
are words, then $PQ$ denotes the word obtained by writing $Q$ to the right of $P$.
For any positive integer $k$, $P^k$ shall stand for the word made up of $k$ consecutive
occurrences of $P$.

The work space of an algorithm often consists of a piece of paper or a 
blackboard. However, we shall make the simplifying assumption that all 
calculations take place on a tape that is divided into squares (see Figure 5.1). 
The tape is potentially infinite in both directions in the sense that, although 
at any moment it is finite, more squares always can be added to the right- 
and left-hand ends of the tape. Each square contains at most one symbol of 
the alphabet A. At any one time, only a finite number of squares contain 
symbols, while the rest are blank. The symbol $a_0$ will be reserved for the
content of a blank square. (In ordinary language, a space is sometimes used
for the same purpose.) Thus, the condition of the tape at a given moment
can be represented by a word of $A$; the tape in Figure 5.1 is $a_2a_0a_5a_1$. Our
use of a one-dimensional tape does not limit the algorithms that can be
handled; the information in a two-dimensional array can be encoded as a
finite sequence.†

Our computing device, which we shall refer to as a Turing machine, works 
in the following way. The machine operates at discrete moments of time, not 
continuously. It has a reading head which, at any moment, will be scanning 
one square of the tape. (Observation of a larger domain could be reduced to 
consecutive observations of individual squares.) The device then reacts in 
any of four different ways: 


1. It prints a symbol in the square, erasing the previous symbol. 
2. It moves to the next square to the right. 

3. It moves to the next square to the left. 

4. It stops. 


* If a language has a denumerable alphabet $\{a_0, a_1, \ldots\}$, then we can replace it by the alphabet
$\{b, *\}$. Each symbol $a_n$ of the old alphabet can be replaced by the expression $b{*}\cdots{*}$, consisting
of $b$ followed by $n$ occurrences of $*$.

† This follows from the fact that there is an effective one-one correspondence between the set
of pairs of natural numbers and the set of natural numbers. For the details, see pages 184-185.

What the machine does depends not only on the observed symbol but also on
the internal state of the machine at that moment (which, in turn, depends on
the previous steps of the computation and on the structure of the machine).
We shall make the plausible assumption that a machine has only a finite
number of internal states $\{q_0, q_1, \ldots, q_m\}$. The machine will always begin its
operation in the initial state $q_0$.

A step in a computation corresponds to a quadruple of one of the following
three forms: (1) $q_ja_ia_kq_r$; (2) $q_ja_iRq_r$; (3) $q_ja_iLq_r$. In each case, $q_j$ is the present
internal state, $a_i$ is the symbol being observed, and $q_r$ is the internal state
after the step. In form (1), the machine erases $a_i$ and prints $a_k$. In form (2), the
reading head of the machine moves one square to the right, and, in form (3),
it moves one square to the left. We shall indicate later how the machine is
told to stop.

Now we can give a precise definition. A Turing machine with an alphabet $A$
of tape symbols $\{a_0, a_1, \ldots, a_n\}$ and with internal states $\{q_0, q_1, \ldots, q_m\}$ is a finite set
$\mathcal{T}$ of quadruples of the forms (1) $q_ja_ia_kq_r$, (2) $q_ja_iRq_r$, and (3) $q_ja_iLq_r$ such that
no two quadruples of $\mathcal{T}$ have the same first two symbols.

Thus, for fixed $q_ja_i$, no two quadruples of types (1), (2), and (3) are in $\mathcal{T}$.
This condition ensures that there is never a situation in which the machine is
instructed to perform two contradictory operations.
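
The requirement that no two quadruples share their first two symbols says that a Turing machine is a lookup table keyed by (state, scanned symbol). A minimal Python sketch of that reading follows. The names `Quadruple` and `make_machine` are hypothetical, and the sketch assumes that no tape symbol is itself written `"R"` or `"L"`, so that the third component of a quadruple can be read unambiguously.

```python
# A quadruple is (state, scanned symbol, action, next state), where the action
# is "R", "L", or the tape symbol to be printed.
Quadruple = tuple[str, str, str, str]


def make_machine(quadruples: list[Quadruple]) -> dict[tuple[str, str], tuple[str, str]]:
    """Build the table of a Turing machine, rejecting contradictory instructions."""
    table: dict[tuple[str, str], tuple[str, str]] = {}
    for state, symbol, action, next_state in quadruples:
        key = (state, symbol)
        if key in table:
            # Two quadruples with the same first two symbols are not allowed.
            raise ValueError(f"two quadruples begin with {state}{symbol}")
        table[key] = (action, next_state)
    return table


# A machine over {a0, a1}: move right over blanks, erase the first a1 found.
eraser = make_machine([("q0", "a0", "R", "q0"), ("q0", "a1", "a0", "q1")])
```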

The Turing machine $\mathcal{T}$ operates in accordance with its list of quadruples.
This can be made precise in the following manner.

By a tape description of $\mathcal{T}$ we mean a word such that: (1) all symbols in the
word but one are tape symbols; (2) the only symbol that is not a tape symbol
is an internal state $q_j$; and (3) $q_j$ is not the last symbol of the word.

A tape description describes the condition of the machine and the tape at a
given moment. When read from left to right, the tape symbols in the description
represent the symbols on the tape at that moment, and the tape symbol
that occurs immediately to the right of $q_j$ in the tape description represents
the symbol being scanned by the reading head at that moment. If the internal
state $q_j$ is the initial state $q_0$, then the tape description is called an initial tape
description.


Example 

The tape description $a_2a_0q_1a_0a_1a_1$ indicates that the machine is in the internal
state $q_1$, the tape is as shown in Figure 5.2, and the reading head is scanning
the square indicated by the arrow.


FIGURE 5.2

We say that $\mathcal{T}$ moves one tape description $\alpha$ into another one $\beta$ (abbreviated
$\alpha \twoheadrightarrow_{\mathcal{T}} \beta$) if and only if one of the following is true.

1. $\alpha$ is of the form $Pq_ja_iQ$, $\beta$ is of the form $Pq_ra_kQ$, and $q_ja_ia_kq_r$ is one of
the quadruples of $\mathcal{T}$.*

2. $\alpha$ is of the form $Pa_sq_ja_iQ$, $\beta$ is of the form $Pq_ra_sa_iQ$, and $q_ja_iLq_r$ is one
of the quadruples of $\mathcal{T}$.

3. $\alpha$ is of the form $q_ja_iQ$, $\beta$ is of the form $q_ra_0a_iQ$, and $q_ja_iLq_r$ is one of the
quadruples of $\mathcal{T}$.

4. $\alpha$ is of the form $Pq_ja_ia_kQ$, $\beta$ is of the form $Pa_iq_ra_kQ$, and $q_ja_iRq_r$ is one
of the quadruples of $\mathcal{T}$.

5. $\alpha$ is of the form $Pq_ja_i$, $\beta$ is of the form $Pa_iq_ra_0$, and $q_ja_iRq_r$ is one of the
quadruples of $\mathcal{T}$.


According to our intuitive picture, “$\mathcal{T}$ moves $\alpha$ into $\beta$” means that, if the condition
at a time $t$ of the Turing machine and tape is described by $\alpha$, then the
condition at time $t + 1$ is described by $\beta$. Notice that, by clause 3, whenever
the machine reaches the left-hand end of the tape and is ordered to move
left, a blank square is attached to the tape on the left; similarly, by clause 5,
a blank square is added on the right when the machine reaches the right-hand
end and has to move right.

We say that $\mathcal{T}$ stops at tape description $\alpha$ if and only if there is no tape
description $\beta$ such that $\alpha \twoheadrightarrow_{\mathcal{T}} \beta$. This happens when $q_ja_i$ occurs in $\alpha$ but $q_ja_i$ is
not the beginning of any quadruple of $\mathcal{T}$.
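
The five clauses, together with the stopping condition, amount to a single-step function on tape descriptions. Here is a minimal Python sketch under some assumptions of its own: a tape description is stored as a triple (symbols to the left of the state, the state, symbols from the scanned square rightward), the machine is a dictionary from (state, symbol) to (action, next state), and the names `step` and `BLANK` are hypothetical.

```python
BLANK = "a0"  # the symbol reserved for a blank square


def step(machine, description):
    """Return the description that `description` moves into, or None if the machine stops."""
    left, state, right = description          # the word is: left + state + right
    scanned, rest = right[0], right[1:]
    if (state, scanned) not in machine:
        return None                            # no quadruple begins with state, scanned
    action, new_state = machine[(state, scanned)]
    if action == "L":
        if left:                               # clause 2: step onto the square at the left
            return left[:-1], new_state, (left[-1],) + right
        return (), new_state, (BLANK,) + right  # clause 3: attach a blank on the left
    if action == "R":
        if rest:                               # clause 4: step onto the square at the right
            return left + (scanned,), new_state, rest
        return left + (scanned,), new_state, (BLANK,)  # clause 5: attach a blank on the right
    return left, new_state, (action,) + rest   # clause 1: print a symbol in place


# A small machine: move left, print a1 on the new blank, then run right until a blank.
demo = {("q0", "a1"): ("L", "q1"), ("q1", "a0"): ("a1", "q2"), ("q2", "a1"): ("R", "q2")}
trace = [((), "q0", ("a1", "a1"))]
while (following := step(demo, trace[-1])) is not None:
    trace.append(following)
```

The list `trace` then holds the successive tape descriptions, ending with one at which the machine stops.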

A computation of $\mathcal{T}$ is a finite sequence of tape descriptions $\alpha_0, \ldots, \alpha_k$ ($k \ge 0$)
such that the following conditions hold.

1. $\alpha_0$ is an initial tape description, that is, the internal state occurring in
$\alpha_0$ is $q_0$.

2. $\alpha_i \twoheadrightarrow_{\mathcal{T}} \alpha_{i+1}$ for $0 \le i < k$.

3. $\mathcal{T}$ stops at $\alpha_k$.

This computation is said to begin at $\alpha_0$ and end at $\alpha_k$. If there is a computation
beginning at $\alpha_0$, we say that $\mathcal{T}$ is applicable to $\alpha_0$.

The algorithm $\mathrm{Alg}_{\mathcal{T}}$ determined by $\mathcal{T}$ is defined as follows:

For any words $P$ and $Q$ of the alphabet $A$ of $\mathcal{T}$, $\mathrm{Alg}_{\mathcal{T}}(P) = Q$ if and only if
there is a computation of $\mathcal{T}$ that begins with the tape description $q_0P$ and
ends with a tape description of the form $R_1q_jR_2$, where $Q = R_1R_2$.

This means that, when $\mathcal{T}$ begins at the left-hand end of $P$ and there is nothing
else on the tape, $\mathcal{T}$ eventually stops with $Q$ as the entire content of the tape.


* Here and below, $P$ and $Q$ are arbitrary (possibly empty) words of the alphabet of $\mathcal{T}$.




Notice that $\mathrm{Alg}_{\mathcal{T}}$ need not be defined for certain words $P$. An algorithm $\mathrm{Alg}_{\mathcal{T}}$
determined by a Turing machine $\mathcal{T}$ is said to be a Turing algorithm.


Example 
In any computation of the Turing machine $\mathcal{T}$ given by

\[ q_0a_0Rq_0,\; q_0a_1a_0q_1,\; q_0a_2a_0q_1,\; \ldots,\; q_0a_na_0q_1 \]

$\mathcal{T}$ locates the first nonblank symbol (if any) at or to the right of the square
scanned at the beginning of the computation, erases that symbol, and then
stops. If there are only blank squares at or to the right of the initial square,
$\mathcal{T}$ keeps on moving right forever.

Let us now consider computations of number-theoretic functions. For convenience,
we sometimes will write | instead of $a_1$ and B instead of $a_0$. (Think
of B as standing for “blank.”) For any natural number $k$, its tape representation
$\overline{k}$ will stand for the word $|^{k+1}$, that is, the word consisting of $k + 1$ occurrences
of |. Thus, $\overline{0} = |$, $\overline{1} = ||$, $\overline{2} = |||$, and so on. The reason why we represent $k$ by
$k + 1$ occurrences of | instead of $k$ occurrences is that we wish $\overline{0}$ to be a nonempty
word, so that we will be aware of its presence. The tape representation
$\overline{(k_1, k_2, \ldots, k_n)}$ of an $n$-tuple of natural numbers $(k_1, k_2, \ldots, k_n)$ is defined to be
the word $\overline{k_1}B\overline{k_2}B\cdots B\overline{k_n}$. For example, $\overline{(3, 1, 0, 5)}$ is ||||B||B|B||||||.
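
The tape representation is simple to compute. As a small illustration, the sketch below encodes numbers and tuples as strings over the two characters `|` and `B`; the function names `rep` and `rep_tuple` are hypothetical.

```python
def rep(k: int) -> str:
    """Tape representation of k: the word made of k + 1 strokes."""
    return "|" * (k + 1)


def rep_tuple(*ks: int) -> str:
    """Tape representation of a tuple: the representations of its entries separated by B."""
    return "B".join(rep(k) for k in ks)


print(rep_tuple(3, 1, 0, 5))
```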

A Turing machine $\mathcal{T}$ will be thought of as computing the following partial
function $f_{\mathcal{T},1}$ of one variable:*

$f_{\mathcal{T},1}(k) = m$ if and only if the following condition holds: $\mathrm{Alg}_{\mathcal{T}}(\overline{k})$ is defined
and $\mathrm{Alg}_{\mathcal{T}}(\overline{k}) = E_1\overline{m}E_2$, where $E_1$ and $E_2$ are certain (possibly empty) words
consisting of only Bs (blanks).

The function $f_{\mathcal{T},1}$ is said to be Turing-computable. Thus, a one-place partial
function $f$ is Turing-computable if and only if there is a Turing machine $\mathcal{T}$ such
that $f = f_{\mathcal{T},1}$.

For each $n > 1$, a Turing machine $\mathcal{T}$ also computes a partial function $f_{\mathcal{T},n}$ of
$n$ variables. For any natural numbers $k_1, \ldots, k_n$:

$f_{\mathcal{T},n}(k_1, \ldots, k_n) = m$ if and only if the following condition holds:
$\mathrm{Alg}_{\mathcal{T}}(\overline{(k_1, k_2, \ldots, k_n)})$ is defined and $\mathrm{Alg}_{\mathcal{T}}(\overline{(k_1, k_2, \ldots, k_n)}) = E_1\overline{m}E_2$, where
$E_1$ and $E_2$ are certain (possibly empty) words consisting of only Bs (blanks).

The partial function $f_{\mathcal{T},n}$ is said to be Turing-computable. Thus, an $n$-place
partial function $f$ is Turing-computable if and only if there is a Turing
machine $\mathcal{T}$ such that $f = f_{\mathcal{T},n}$.

Notice that, at the end of a computation of a value of a Turing-computable
function, only the value appears on the tape, aside from blank squares at
either or both ends, and the location of the reading head does not matter.
Also observe that, whenever the function is not defined, either the Turing
machine will never stop or, if it does stop, the resulting tape is not of the
appropriate form $E_1\overline{m}E_2$.

* Remember that a partial function may fail to be defined for some values of its argument.
Thus, a total function is considered to be a special case of a partial function.


Examples 
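1. Consider the Turing machine $\mathcal{T}$ with alphabet {B, |}, defined
by $q_0|Lq_1$, $q_1B|q_2$. $\mathcal{T}$ computes the successor function $N(x)$,
since $q_0\overline{k} \twoheadrightarrow_{\mathcal{T}} q_1B\overline{k} \twoheadrightarrow_{\mathcal{T}} q_2\overline{k+1}$, and $\mathcal{T}$ stops at $q_2\overline{k+1}$. Hence $N(x)$ is
Turing-computable.

2. The Turing machine $\mathcal{T}$ defined by

\[ q_0|Bq_1,\; q_1BRq_0,\; q_0B|q_2 \]

computes the zero function $Z(x)$. Given $\overline{k}$ on the tape, $\mathcal{T}$ moves right,
erasing all |s until it reaches a blank, which it changes to a |. So, $\overline{0}$ is
the final result. Thus, $Z(x)$ is Turing-computable.

3. The addition function is computed by the Turing machine $\mathcal{T}$ defined
by the following seven quadruples:

\[ q_0|Bq_0,\; q_0BRq_1,\; q_1|Rq_1,\; q_1B|q_2,\; q_2|Rq_2,\; q_2BLq_3,\; q_3|Bq_3 \]

In fact, for any natural numbers $m$ and $n$,

\[
\begin{aligned}
q_0\overline{(m,n)} = q_0|^{m+1}B|^{n+1}
 &\twoheadrightarrow_{\mathcal{T}} q_0B|^{m}B|^{n+1}
 \twoheadrightarrow_{\mathcal{T}} Bq_1|^{m}B|^{n+1} \\
 &\twoheadrightarrow_{\mathcal{T}} \cdots
 \twoheadrightarrow_{\mathcal{T}} B|^{m}q_1B|^{n+1}
 \twoheadrightarrow_{\mathcal{T}} B|^{m}q_2|\,|^{n+1} \\
 &\twoheadrightarrow_{\mathcal{T}} \cdots
 \twoheadrightarrow_{\mathcal{T}} B|^{m+n+2}q_2B
 \twoheadrightarrow_{\mathcal{T}} B|^{m+n+1}q_3|B \\
 &\twoheadrightarrow_{\mathcal{T}} B|^{m+n+1}q_3BB = B\,\overline{m+n}\,q_3BB
\end{aligned}
\]

and $\mathcal{T}$ stops at $B\,\overline{m+n}\,q_3BB$.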
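
The three examples can be checked by direct simulation. The following Python sketch is an illustration only. It assumes each machine is stored as a dictionary from (state, scanned symbol) to (action, next state), uses the hypothetical helper names `run` and `computed_value`, and imposes an artificial bound `max_steps` so that a run which has not stopped within that many steps is reported as `None`.

```python
BLANK, STROKE = "B", "|"


def run(machine, word, max_steps=10_000):
    """Start in q0 at the left-hand end of `word`; return the final tape word, or None."""
    tape = list(word) or [BLANK]
    position, state = 0, "q0"
    for _ in range(max_steps):
        key = (state, tape[position])
        if key not in machine:
            return "".join(tape)               # no applicable quadruple: the machine stops
        action, state = machine[key]
        if action == "L":
            if position == 0:
                tape.insert(0, BLANK)          # attach a blank square on the left
            else:
                position -= 1
        elif action == "R":
            position += 1
            if position == len(tape):
                tape.append(BLANK)             # attach a blank square on the right
        else:
            tape[position] = action            # print a symbol
    return None


def computed_value(machine, *args):
    """Value of the computed partial function, or None when it is undefined (or unknown)."""
    word = run(machine, BLANK.join(STROKE * (k + 1) for k in args))
    if word is None:
        return None
    core = word.strip(BLANK)                   # drop blanks at either end
    return len(core) - 1 if core and set(core) == {STROKE} else None


SUCCESSOR = {("q0", "|"): ("L", "q1"), ("q1", "B"): ("|", "q2")}
ZERO = {("q0", "|"): ("B", "q1"), ("q1", "B"): ("R", "q0"), ("q0", "B"): ("|", "q2")}
ADD = {
    ("q0", "|"): ("B", "q0"), ("q0", "B"): ("R", "q1"), ("q1", "|"): ("R", "q1"),
    ("q1", "B"): ("|", "q2"), ("q2", "|"): ("R", "q2"), ("q2", "B"): ("L", "q3"),
    ("q3", "|"): ("B", "q3"),
}

print(computed_value(SUCCESSOR, 3), computed_value(ZERO, 3), computed_value(ADD, 2, 3))
```

Because of the step bound, a result of `None` shows only that no value was obtained within the bound; it does not establish that the machine never stops.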


Exercises 
5.1 Show that the function $U_2^2$ such that $U_2^2(x_1, x_2) = x_2$ is Turing-computable.


5.2 a. What function $f(x_1, x_2, x_3)$ is computed by the following Turing
machine?

\[
\begin{array}{llll}
q_0||q_1, & q_1|Bq_0, & q_0BRq_1, & q_1BRq_2, \\
q_2|Rq_2, & q_2BRq_3, & q_3|Bq_4, & q_4BRq_3
\end{array}
\]


b. What function $f(x)$ is computed by the following Turing machine?

\[ q_0|Bq_1,\; q_1BRq_2,\; q_2B|q_2 \]

5.3 a. State in plain language the operation of the Turing machine,
described in Example 3, for computing the addition function.

b. Starting with the tape description $q_0|||B|||$, write the sequence of
tape descriptions that make up the computation by the addition
machine of Example 3.


5.4 What function $f(x)$ is computed by the following Turing machine?

\[
\begin{array}{lll}
q_0|Rq_1 & q_4|Rq_4 & q_6B|q_0 \\
q_1|Bq_2 & q_4B|q_5 & q_1B|q_7 \\
q_2BRq_3 & q_5|Lq_5 & q_7|Lq_7 \\
q_3|Rq_3 & q_5BLq_6 & q_7BRq_8 \\
q_3BRq_4 & q_6|Lq_6 & q_8|Bq_8
\end{array}
\]


5.5 Find a Turing machine that computes the function sg(x). (Recall that
sg(0) = 0 and sg(x) = 1 for x > 0.)
5.6 Find Turing machines that compute the following functions.
a. $x \dot{-} y$ (Remember that $x \dot{-} y = x - y$ if $x \ge y$, and $x \dot{-} y = 0$ if $x < y$.)
b. [x/2] (Recall that [x/2] is the greatest integer less than or equal to
x/2. Thus, [x/2] = x/2 if x is even, and [x/2] = (x - 1)/2 if x is odd.)
c. $x \cdot y$, the product of x and y.


5.7 If a function is Turing-computable, show that it is computable by infi- 
nitely many different Turing machines. 


5.2 Diagrams 


Many Turing machines that compute even relatively simple functions (like 
multiplication) require a large number of quadruples. It is difficult and 
tedious to construct such machines, and even more difficult to check that 
they do the desired job. We shall introduce a pictorial technique for con- 
structing Turing machines so that their operation is easier to comprehend. 
The basic ideas and notation are due to Hermes (1965). 


1. Let $\mathcal{T}_1, \ldots, \mathcal{T}_r$ be any Turing machines with alphabet $A = \{a_0, a_1, \ldots, a_k\}$.

2. Select a finite set of points in a plane. These points will be called
vertices.

3. To each vertex attach the name of one of the machines $\mathcal{T}_1, \ldots, \mathcal{T}_r$. Copies
of the same machine may be assigned to more than one vertex.

4. Connect some vertices to others by arrows. An arrow may go from
a vertex to itself. Each arrow is labeled with one of the numbers
0, 1, ..., k. No two arrows that emanate from the same vertex are
allowed to have the same label.

5. One vertex is enclosed in a circle and is called the initial vertex.

The resulting graph is called a diagram.


Example
See Figure 5.3.

FIGURE 5.3

We shall show that every diagram determines a Turing machine whose 
operation can be described in the following manner. Given a tape and a spe- 
cific square on the tape, the Turing machine of the initial vertex V of the dia- 
gram begins to operate, with its reading head scanning the specified square 
of the tape. If this machine finally stops and the square being scanned at the 
end of the computation contains the symbol $a_i$, then we look for an arrow
with label i emanating from the vertex V. If there is no such arrow, the com- 
putation stops. If there is such an arrow, it leads to a vertex to which another 
Turing machine has been assigned. Start that machine on the tape produced 
by the previous computation, at the square that was being scanned at the end 
of the computation. Repeat the same procedure that was just performed, and 
keep on doing this until the machine stops. The resulting tape is the output 
of the machine determined by the diagram. If the machine never stops, then 
it is not applicable to the initial tape description. 

The quadruples for this Turing machine can be specified in the following 
way. 


1. For each occurrence in the diagram of a machine $\mathcal{T}_j$, write its quadruples,
changing internal states so that no two machine occurrences
have an internal state in common. The initial vertex machine
is not to be changed. This retains $q_0$ as the initial internal state of
the machine assigned to the initial vertex. For every other machine
occurrence, the original initial state $q_0$ has been changed to a new
internal state.

2. If an occurrence of some $\mathcal{T}_i$ is connected by an arrow $\xrightarrow{u}$ to some
$\mathcal{T}_j$, then, for every (new) internal state $q_s$ of that occurrence of $\mathcal{T}_i$
such that no (new) quadruple of $\mathcal{T}_i$ begins with $q_sa_u$, add the quadruple
$q_sa_ua_uq_t$, where $q_t$ is the (new) initial state for $\mathcal{T}_j$. (Step 2
ensures that, whenever $\mathcal{T}_i$ stops while scanning $a_u$, $\mathcal{T}_j$ will begin
operating.)
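
Steps 1 and 2 can be carried out mechanically. The Python sketch below is one possible rendering, under assumptions of its own: each machine is a dictionary from (state, symbol) to (action, next state) whose initial state is named `"q0"`, arrows are labeled by the tape symbol itself rather than by its index, states of non-initial vertices are renamed by prefixing the vertex name, and the function name `compose` is hypothetical.

```python
def compose(vertices, arrows, initial):
    """Build the quadruple table of the machine determined by a diagram.

    vertices: dict mapping a vertex name to the table of the machine placed there
    arrows:   dict mapping (source vertex, tape symbol) to the target vertex
    initial:  name of the initial vertex
    """
    def new(vertex, state):
        # Step 1: the initial vertex keeps its states; all others are renamed apart.
        return state if vertex == initial else f"{vertex}.{state}"

    table = {}
    for vertex, machine in vertices.items():
        states = {"q0"}
        for (state, symbol), (action, next_state) in machine.items():
            table[(new(vertex, state), symbol)] = (action, new(vertex, next_state))
            states.update((state, next_state))
        # Step 2: wherever this occurrence would stop while scanning `symbol`,
        # pass control to the initial state of the arrow's target.
        for (source, symbol), target in arrows.items():
            if source == vertex:
                for state in states:
                    if (state, symbol) not in machine:
                        table[(new(vertex, state), symbol)] = (symbol, new(target, "q0"))
    return table


# The machine that moves one square right, over the alphabet {B, |}.
RIGHT = {("q0", "B"): ("R", "q1"), ("q0", "|"): ("R", "q1")}

# One vertex with an arrow labeled | back to itself: keep moving right until a blank.
looped = compose({"v": RIGHT}, {("v", "|"): "v"}, initial="v")

# Two vertices joined by arrows for every symbol: move right twice.
twice = compose({"v": RIGHT, "w": RIGHT}, {("v", "B"): "w", ("v", "|"): "w"}, initial="v")
```

Prefixing the vertex name is only one way to keep the state sets disjoint; any renaming that leaves the initial vertex untouched would serve.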


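The following abbreviations are used in diagrams:

1. If one vertex is connected to another vertex by all arrows $\xrightarrow{0}, \xrightarrow{1}, \ldots, \xrightarrow{k}$,
we replace the arrows by one unlabelled arrow.

2. If one vertex is connected to another by all arrows except $\xrightarrow{u}$, we
replace all the arrows by $\xrightarrow{\neq u}$.

3. Let $\mathcal{T}_1\mathcal{T}_2$ stand for $\mathcal{T}_1 \to \mathcal{T}_2$, let $\mathcal{T}_1\mathcal{T}_2\mathcal{T}_3$ stand for $\mathcal{T}_1 \to \mathcal{T}_2 \to \mathcal{T}_3$, and so on.
Let $\mathcal{T}^2$ be $\mathcal{T}\mathcal{T}$, let $\mathcal{T}^3$ be $\mathcal{T}\mathcal{T}\mathcal{T}$, and so forth.

4. If no vertex is circled, then the leftmost vertex is to be initial.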


To construct diagrams, we need a few simple Turing machines as building 
blocks. 


1. r (right machine). Let $\{a_0, a_1, \ldots, a_k\}$ be the alphabet. r consists of the
quadruples $q_0a_iRq_1$ for all $a_i$. This machine, which has k + 1 quadruples,
moves one square to the right and then stops.

2. l (left machine). Let $\{a_0, a_1, \ldots, a_k\}$ be the alphabet. l consists of the
quadruples $q_0a_iLq_1$ for all $a_i$. This machine, which has k + 1 quadruples,
moves one square to the left and then stops.

3. $a_j$ (constant machine) for the alphabet $\{a_0, a_1, \ldots, a_k\}$. $a_j$ consists of
the quadruples $q_0a_ia_jq_1$ for all $a_i$. This machine replaces the initial
scanned symbol by $a_j$ and then stops. In particular, $a_0$ erases the
scanned symbol, and $a_1$ prints |.


Examples of Turing Machines Defined by Diagrams


1. P (Figure 5.4) finds the first blank to the right of the initially scanned
square. In an alphabet $\{a_0, a_1, \ldots, a_k\}$, the quadruples for the machine
P are $q_0a_iRq_1$ for all $a_i$, and $q_1a_ia_iq_0$ for all $a_i \neq a_0$.

FIGURE 5.4 (diagram with an arrow labeled $\neq 0$)

2. $\Lambda$ (Figure 5.5) finds the first blank to the left of the initially scanned
square.

FIGURE 5.5 (diagram with an arrow labeled $\neq 0$)

Exercises 


5.8 Describe the operations of the Turing machines $\rho$ (Figure 5.6) and $\lambda$
(Figure 5.7) and write the list of quadruples for each machine.

5.9 Show that machine S in Figure 5.8 searches the tape for a nonblank
square. If there are such squares, S finds one and stops. Otherwise,
S never stops.

FIGURE 5.6

FIGURE 5.7

FIGURE 5.8

FIGURE 5.9


To describe some aspects of the operation of a Turing machine on part of a
tape, we introduce the following notation:

~          arbitrary symbol
B ... B    sequence of blanks
B ...      everything blank to the right
... B      everything blank to the left
W          nonempty word consisting of nonblanks
X          $W_1BW_2B \ldots W_n$ ($n \ge 1$), a sequence of nonempty words of nonblanks,
           separated by blanks

Underlining will indicate the scanned symbol.
More Examples of Turing Machines Defined by Diagrams


3. R (right-end machine). See Figure 5.9.

\[ \underline{\sim}\,XBB \;\Rightarrow\; {\sim}X\underline{B}B \]

Squares on the rest of the tape are not affected. The same assumption
is made in similar places below. When the machine R begins
on a square preceding a sequence of one or more nonempty words,
followed by at least two blank squares, it moves right to the first of
those blank squares and stops.

4. L (left-end machine). See Figure 5.10.

\[ BBX\underline{\sim} \;\Rightarrow\; B\underline{B}X{\sim} \]

5. T (left-translation machine). See Figure 5.11.*

\[ {\sim}BWB \;\Rightarrow\; {\sim}WBB \]

This machine shifts the whole word W one square to the left.

* There is a separate arrow from $r^2$ to each of the groups on the right and a separate arrow from
each of these, except $la_0$, back to $r^2$.

FIGURE 5.10

FIGURE 5.11

FIGURE 5.12

6. $\sigma$ (shift machine). See Figure 5.12.

\[ BW_1BW_2B \;\Rightarrow\; BW_2B \ldots B \]

In the indicated situation, $W_1$ is erased and $W_2$ is shifted leftward so
that it begins where $W_1$ originally began.

7. C (clean-up machine). See Figure 5.13.

\[ {\sim}BBXBWB \;\Rightarrow\; {\sim}WB \ldots B \]

8. K (word-copier). See Figure 5.14.

\[ BWB \ldots \;\Rightarrow\; BWBWB \ldots \]

FIGURE 5.13

FIGURE 5.14

9. $K_n$ (n-shift copier). See Figure 5.15.

\[ BW_nBW_{n-1}B \ldots W_1B \ldots \;\Rightarrow\; BW_nBW_{n-1}B \ldots W_1BW_nB \ldots \]


Exercises 
5.10 Find the number-theoretic function f(x) computed by each of the following
Turing machines.
a. $la_1$
b. Figure 5.16
c. $PK\Lambda a_1\Lambda(ra_0)^2$

5.11 Verify that the given functions are computed by the indicated Turing
machines.
a. $|x - y|$ (Figure 5.17)
b. $x + y$: $Pa_1\Lambda(ra_0)^2$
c. $x \cdot y$ (Figure 5.18)

5.12 Draw diagrams for Turing machines that will compute the following
functions: (a) max(x, y) (b) min(x, y) (c) $x \dot{-} y$ (d) [x/2].

FIGURE 5.15

FIGURE 5.16

FIGURE 5.17

FIGURE 5.18

5.13 Prove that, for any Turing machine $\mathcal{T}$ with alphabet $\{a_0, \ldots, a_k\}$, there is
a diagram using the Turing machines r, l, $a_0, \ldots, a_k$ that defines a Turing
machine $\mathcal{S}$ such that $\mathcal{T}$ and $\mathcal{S}$ have the same effect on all tapes. (In fact,
$\mathcal{S}$ can be defined so that, except for two additional trivial initial moves
left and right, it carries out the same computations as $\mathcal{T}$.)


5.3 Partial Recursive Functions: Unsolvable Problems 


Recall, from Section 3.3, that the recursive functions are obtained from the
initial functions (the zero function $Z(x)$, the successor function $N(x)$, and the
projection functions $U_i^n(x_1, \ldots, x_n)$) by means of substitution, recursion, and
the restricted $\mu$-operator. Instead of the restricted $\mu$-operator, let us introduce
the unrestricted $\mu$-operator:

If
\[ f(x_1, \ldots, x_n) = \mu y(g(x_1, \ldots, x_n, y) = 0) = \text{the least } y \text{ such that } g(x_1, \ldots, x_n, y) = 0 \]
then $f$ is said to arise from $g$ by means of the unrestricted $\mu$-operator.

Notice that, for some $x_1, \ldots, x_n$, the value $f(x_1, \ldots, x_n)$ need not be defined; this
happens when there is no $y$ such that $g(x_1, \ldots, x_n, y) = 0$.

If we replace the restricted $\mu$-operator by the unrestricted $\mu$-operator in
the definition of the recursive functions, we obtain a definition of the partial
recursive functions. In other words, the partial recursive functions are
those functions obtained from the initial functions by means of substitution,
recursion and the unrestricted $\mu$-operator.

Whereas all recursive functions are total functions, some partial recursive
functions will not be total functions. For example, $\mu y(x + y = 0)$ is defined
only when $x = 0$.

Since partial recursive functions may not be defined for certain arguments,
the definition of the unrestricted $\mu$-operator should be made more precise:

$\mu y(g(x_1, \ldots, x_n, y) = 0) = k$ means that, for $0 \le u < k$,
$g(x_1, \ldots, x_n, u)$ is defined and $g(x_1, \ldots, x_n, u) \neq 0$, and
$g(x_1, \ldots, x_n, k) = 0$.

Observe that, if $R(x_1, \ldots, x_n, y)$ is a recursive relation, then $\mu y(R(x_1, \ldots, x_n, y))$
can be considered an admissible application of the unrestricted $\mu$-operator.
In fact, $\mu y(R(x_1, \ldots, x_n, y)) = \mu y(C_R(x_1, \ldots, x_n, y) = 0)$, where $C_R$ is the characteristic
function of $R$. Since $R$ is a recursive relation, $C_R$ is, by definition, a recursive
function.
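
The precise definition of the unrestricted $\mu$-operator describes an upward search. The Python sketch below is a hedged illustration: a partial function is modeled as a Python function that returns `None` where it is undefined, the name `mu` is hypothetical, and the optional cutoff `max_y` is an artificial device so that the examples terminate. A genuine unrestricted search has no such bound and simply never ends when no suitable $y$ exists.

```python
from typing import Callable, Optional


def mu(g: Callable[..., Optional[int]], *xs: int, max_y: Optional[int] = None) -> Optional[int]:
    """Least y with g(xs, y) == 0, where g must be defined and nonzero at every smaller value."""
    y = 0
    while max_y is None or y <= max_y:
        value = g(*xs, y)
        if value is None:
            return None        # g is undefined before a zero is reached, so mu is undefined
        if value == 0:
            return y
        y += 1
    return None                # cutoff reached: no value found within the bound


def c_r(x: int, y: int) -> int:
    """Characteristic function (0 when the relation holds) of the relation y * y >= x."""
    return 0 if y * y >= x else 1


print(mu(lambda x, y: x + y, 0))             # x + y = 0 has the solution y = 0 when x = 0
print(mu(lambda x, y: x + y, 3, max_y=50))   # no solution exists; the cutoff returns None
print(mu(c_r, 10))                           # least y with y * y >= 10
```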
Exercises


5.14 Describe the following partial recursive functions.
a. $\mu y(x + y + 1 = 0)$
b. $\mu y(y > x)$
c. $\mu y(y + x = x)$
5.15 Show that all recursive functions are partial recursive. 


5.16 Show that every partial function whose domain is a finite set of natural 
numbers is a partial recursive function. 


It is easy to convince ourselves that every partial recursive function
$f(x_1, \ldots, x_n)$ is computable, in the sense that there is an algorithm that computes
$f(x_1, \ldots, x_n)$ when $f(x_1, \ldots, x_n)$ is defined and gives no result when $f(x_1, \ldots, x_n)$
is undefined. This property is clear for the initial functions and is inherited
under the operations of substitution, recursion and the unrestricted
$\mu$-operator.

It turns out that the partial recursive functions are identical with the
Turing-computable functions. To show this, it is convenient to introduce a
different kind of Turing-computability.

A partial number-theoretic function $f(x_1, \ldots, x_n)$ is said to be standard Turing-computable
if there is a Turing machine $\mathcal{T}$ such that, for any natural numbers
$k_1, \ldots, k_n$, the following holds.

Let $B\overline{k_1}B\overline{k_2}B \ldots B\overline{k_n}$ be called the argument strip.* Notice that the argument
strip is $B\overline{(k_1, \ldots, k_n)}$. Take any tape containing the argument strip but without
any symbols to the right of it. (It may contain symbols to the left.) The
machine $\mathcal{T}$ is begun on this tape with its reading head scanning the first |
of $\overline{k_1}$. Then

1. $\mathcal{T}$ stops if and only if $f(k_1, \ldots, k_n)$ is defined.
2. If $\mathcal{T}$ stops, the tape contains the same argument strip as before,
followed by $B\overline{f(k_1, \ldots, k_n)}$. Thus, the final tape contains

\[ B\overline{k_1}B\overline{k_2}B \ldots B\overline{k_n}B\overline{f(k_1, \ldots, k_n)} \]

Moreover:
3. The reading head is scanning the first | of $\overline{f(k_1, \ldots, k_n)}$.
4. There is no nonblank symbol on the tape to the right of $\overline{f(k_1, \ldots, k_n)}$.
5. During the entire computation, the reading head never scans any
square to the left of the argument strip.

* For a function of one variable, the argument strip is taken to be $B\overline{k_1}$.

For the sake of brevity, we shall say that the machine $\mathcal{T}$ described above
ST-computes the function $f(x_1, \ldots, x_n)$.

Thus, the additional requirement of standard Turing computability is that
the original arguments are preserved, the machine stops if and only if the
function is defined for the given arguments, and the machine operates on
or to the right of the argument strip. In particular, anything to the left of the
argument strip remains unchanged.


Proposition 5.1 


Every standard Turing-computable function is Turing-computable. 


Proof 


Let $\mathcal{T}$ be a Turing machine that ST-computes a partial function $f(x_1, \ldots, x_n)$.
Then $f$ is Turing-computable by the Turing machine $\mathcal{T}PC$. In fact, after $\mathcal{T}$
operates, we obtain $B\overline{x_1}B \ldots B\overline{x_n}B\overline{f(x_1, \ldots, x_n)}$, with the reading head at the
leftmost | of $\overline{f(x_1, \ldots, x_n)}$. P then moves the reading head to the right of
$\overline{f(x_1, \ldots, x_n)}$, and then C removes the original argument strip.


Proposition 5.2 
Every partial recursive function is standard Turing-computable. 


Proof 


a. Pra, ST-computes the zero function Z(x). 

b. The successor function N(x) is ST-computed by PKa,Ar. 

c. The projection function U}'(x1, ..., X») =x; is ST-computed by .“K,,_;,,Ar. 

d. (Substitution.) Let fx, ..., X,) = @iXy ., Xy), oy My (Xy -., X,)) and 
assume that 7 ST-computes g and; SI-computes h; for 1 <j < m. Let 
4, be the machine .; Po"(K,,,;)"A"r. The reader should verify that f is 
ST-computed by 


AP(Kyst)"A't 2.422. Sint FmPO"A"t 7 Po" Ar 


We take advantage of the ST-computability when, storing $\overline{x_1}, \ldots, \overline{x_n}$,
$\overline{h_1(x_1, \ldots, x_n)}, \ldots, \overline{h_i(x_1, \ldots, x_n)}$ on the tape, we place $\overline{(x_1, \ldots, x_n)}$ on the tape
to the right and compute $\overline{h_{i+1}(x_1, \ldots, x_n)}$ without disturbing what we have
stored on the left.


FIGURE 5.19 [Turing machine diagram for the recursion case (e)]
e. (Recursion.) Let


$$f(x_1, \ldots, x_n, 0) = g(x_1, \ldots, x_n)$$
$$f(x_1, \ldots, x_n, y+1) = h(x_1, \ldots, x_n, y, f(x_1, \ldots, x_n, y))$$


Assume that $\mathcal{T}$ ST-computes g and $\mathcal{T}'$ ST-computes h. Then the reader
should verify that the machine in Figure 5.19 ST-computes f.


f. (Unrestricted μ-operator.) Let $f(x_1, \ldots, x_n) = \mu y(g(x_1, \ldots, x_n, y) = 0)$ and assume
that $\mathcal{T}$ ST-computes g. Then the machine in Figure 5.20 ST-computes f.


Exercise 
5.17 For a recursion of the form


$$f(0) = k$$
$$f(y+1) = h(y, f(y))$$


show how the diagram in Figure 5.19 must be modified. 


Corollary 5.3 


Every partial recursive function is Turing-computable. 


FIGURE 5.20 [Turing machine diagram for the unrestricted μ-operator case (f)]


Exercise


5.18 Prove that every partial recursive function is Turing-computable by a
Turing machine with alphabet $\{a_0, a_1\}$.


In order to prove the converse of Corollary 5.3, we must arithmetize
the language of Turing computability by assigning numbers, called Gödel
numbers, to the expressions arising in our study of Turing machines. "R"
and "L" are assigned the Gödel numbers 3 and 5, respectively. The tape
symbols $a_i$ are assigned the numbers 7 + 4i, while the internal state
symbols $q_i$ are given the numbers 9 + 4i. For example, the blank B, which is $a_0$,
receives the number 7; the stroke |, which is $a_1$, has the number 11; and the
initial internal state symbol $q_0$ has the number 9. Notice that all symbols
have odd Gödel numbers, and different symbols have different numbers
assigned to them.

As in Section 3.4, a finite sequence $u_0, u_1, \ldots, u_k$ of symbols is assigned the
Gödel number $p_0^{g(u_0)} p_1^{g(u_1)} \cdots p_k^{g(u_k)}$, where $p_0, p_1, p_2, \ldots$ are the prime numbers
2, 3, 5, ... in ascending order and $g(u_i)$ is the Gödel number assigned to $u_i$. For
example, the quadruple $q_0a_0a_1q_0$ receives the Gödel number $2^9 3^7 5^{11} 7^9$.

By an expression we mean a finite sequence of symbols. We have just shown
how to assign Gödel numbers to expressions. In a similar manner, to any finite
sequence $E_0, E_1, \ldots, E_m$ of expressions we assign the number $p_0^{g(E_0)} p_1^{g(E_1)} \cdots p_m^{g(E_m)}$.
For example, this assigns Gödel numbers to finite sequences of Turing
machine quadruples and to finite sequences of tape descriptions. Observe
that the Gödel number of an expression is even and, therefore, different from
the Gödel number of a symbol, which is odd. Moreover, the Gödel number
of a sequence of expressions has an even number as an exponent of $p_0$ and is,
therefore, different from the Gödel number of an expression, which has an
odd number as an exponent of $p_0$.
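The numbering scheme just described can be sketched in a few lines of Python. The helper names (`primes`, `sym`, `state`, `encode`) are hypothetical and chosen only for this illustration; the codes 3, 5, 7 + 4i, and 9 + 4i and the prime-power encoding are the ones given in the text.

```python
# Illustrative sketch; the function names are assumptions, not notation from the text.

def primes(count):
    """Return the first `count` primes 2, 3, 5, ... by trial division."""
    found = []
    candidate = 2
    while len(found) < count:
        if all(candidate % p for p in found):
            found.append(candidate)
        candidate += 1
    return found

R, L = 3, 5                      # codes of the move symbols "R" and "L"

def sym(i):
    """Code of the tape symbol a_i (a_0 is the blank B, a_1 is the stroke |)."""
    return 7 + 4 * i

def state(i):
    """Code of the internal state symbol q_i."""
    return 9 + 4 * i

def encode(codes):
    """Goedel number p_0^c_0 * p_1^c_1 * ... of a finite sequence of codes."""
    number = 1
    for p, c in zip(primes(len(codes)), codes):
        number *= p ** c
    return number

# A sample quadruple q_0 a_0 a_1 q_0 and its Goedel number.
quad = [state(0), sym(0), sym(1), state(0)]
number = encode(quad)
```

Every symbol code is odd, while `number` is even because the exponent of 2 is at least 9; this is the parity distinction between symbols and expressions noted above.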

The reader should review Sections 3.3 and 3.4, especially the functions $\ell h(x)$,
$(x)_j$, and $x * y$. Assume that x is the Gödel number of a finite sequence $w_0,
w_1, \ldots, w_k$, that is, $x = p_0^{g(w_0)} p_1^{g(w_1)} \cdots p_k^{g(w_k)}$, where $g(w_j)$ is the Gödel number
of $w_j$. Recall that $\ell h(x) = k + 1$, the length of the sequence, and $(x)_j = g(w_j)$, the
Gödel number of the jth term of the sequence. If, in addition, y is the Gödel
number of a finite sequence $v_0, v_1, \ldots, v_m$, then $x * y$ is the Gödel number of the
juxtaposition $w_0, w_1, \ldots, w_k, v_0, v_1, \ldots, v_m$ of the two sequences.
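As a rough illustration of these three functions, here is a small Python sketch operating on ordinary integers. The names `nth_prime`, `component`, `lh`, and `star` are hypothetical, and the code assumes x ≥ 1 and is only practical for short sequences with small codes.

```python
def nth_prime(j):
    """p_j, with p_0 = 2, found by trial division (adequate for small j)."""
    count, candidate = -1, 1
    while count < j:
        candidate += 1
        if all(candidate % d for d in range(2, int(candidate ** 0.5) + 1)):
            count += 1
    return candidate

def component(x, j):
    """(x)_j: the exponent of p_j in the prime factorization of x."""
    p, exponent = nth_prime(j), 0
    while x % p == 0:
        x //= p
        exponent += 1
    return exponent

def lh(x):
    """lh(x): the number of distinct primes dividing x."""
    count, j = 0, 0
    while x > 1:
        p = nth_prime(j)
        if x % p == 0:
            count += 1
            while x % p == 0:
                x //= p
        j += 1
    return count

def star(x, y):
    """x * y: the code of the sequence coded by x followed by the one coded by y."""
    result = x
    for j in range(lh(y)):
        result *= nth_prime(lh(x) + j) ** component(y, j)
    return result
```

For instance, with x coding the sequence 9, 7 and y coding 11, 9, `star(x, y)` codes 9, 7, 11, 9.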


Proposition 5.4 


The following number-theoretic relations and functions are primitive recursive.
In each case, we write first the notation for the relation or function, then
the intuitive interpretation in terms of Turing machines, and, finally, the
exact definition. (For the proofs of primitive recursiveness, use Proposition
3.18 and various primitive recursive relations and functions defined in Section 3.3.
At a first reading, it may be advisable to concentrate on just the intuitive
meanings and postpone the technical verification until later.)

IS(x): x is the Gödel number of an internal state symbol $q_u$:


$$(\exists u)_{u<x}(x = 9 + 4u)$$


Sym(x): x is the Gödel number of an alphabet symbol $a_u$:


$$(\exists u)_{u<x}(x = 7 + 4u)$$


Quad(x): x is the Gödel number of a Turing machine quadruple:


$$\ell h(x) = 4 \wedge IS((x)_0) \wedge Sym((x)_1) \wedge IS((x)_3) \wedge [Sym((x)_2) \vee (x)_2 = 3 \vee (x)_2 = 5]$$


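TM(x): x is the Gödel number of a Turing machine (in the form of a finite
sequence of appropriate quadruples):


$$(\forall u)_{u<\ell h(x)} Quad((x)_u) \wedge x > 1 \wedge (\forall u)_{u<\ell h(x)}(\forall v)_{v<\ell h(x)}(u \ne v \Rightarrow [((x)_u)_0 \ne ((x)_v)_0 \vee ((x)_u)_1 \ne ((x)_v)_1])$$


TD(x): x is the Gödel number of a tape description:


$$x > 1 \wedge (\forall u)_{u<\ell h(x)}[IS((x)_u) \vee Sym((x)_u)] \wedge (\exists_1 u)_{u<\ell h(x)} IS((x)_u) \wedge (\forall u)_{u<\ell h(x)}(IS((x)_u) \Rightarrow u + 1 < \ell h(x))$$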


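Cons(x, y, z): x and y are Gödel numbers of tape descriptions α and β, and z
is the Gödel number of a Turing machine quadruple that transforms α into β:


$TD(x) \wedge TD(y) \wedge Quad(z) \wedge (\exists w)_{w<\ell h(x) \dot{-} 1}[IS((x)_w) \wedge (x)_w = (z)_0 \wedge (x)_{w+1} = (z)_1 \wedge$

(I) $([Sym((z)_2) \wedge (y)_{w+1} = (z)_2 \wedge (y)_w = (z)_3 \wedge \ell h(x) = \ell h(y) \wedge (\forall u)_{u<\ell h(x)}(u \ne w \wedge u \ne w+1 \Rightarrow (x)_u = (y)_u)] \vee$

(II) $[(z)_2 = 3 \wedge (y)_w = (x)_{w+1} \wedge (y)_{w+1} = (z)_3 \wedge (\forall u)_{u<\ell h(x)}(u \ne w \wedge u \ne w+1 \Rightarrow (y)_u = (x)_u) \wedge ([w+2 < \ell h(x) \wedge \ell h(y) = \ell h(x)] \vee [w+2 = \ell h(x) \wedge \ell h(y) = \ell h(x)+1 \wedge (y)_{w+2} = 7])] \vee$

(III) $[(z)_2 = 5 \wedge \{[w \ne 0 \wedge (y)_{w \dot{-} 1} = (z)_3 \wedge (y)_w = (x)_{w \dot{-} 1} \wedge \ell h(y) = \ell h(x) \wedge (\forall u)_{u<\ell h(x)}(u \ne w \dot{-} 1 \wedge u \ne w \Rightarrow (y)_u = (x)_u)] \vee [w = 0 \wedge (y)_0 = (z)_3 \wedge (y)_1 = 7 \wedge \ell h(y) = \ell h(x)+1 \wedge (\forall u)_{0<u<\ell h(x)}((y)_{u+1} = (x)_u)]\}])]$


(I) corresponds to a quadruple $q_ja_ia_kq_r$, (II) to a quadruple $q_ja_iRq_r$, and (III) to a
quadruple $q_ja_iLq_r$.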
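The three cases of Cons are easier to follow on uncoded tape descriptions. The sketch below is only an illustration under assumptions of our own: a tape description is a Python list of strings containing exactly one state symbol (a string beginning with `q`) that is not the last entry, the blank is written `'B'`, and the function name `step` is hypothetical.

```python
def step(td, quad):
    """Apply one quadruple to a tape description; return None if it does not apply.

    td   : list such as ['|', 'q0', '|', 'B'] with exactly one state symbol
    quad : (state, scanned, action, new_state), where action is a tape
           symbol to print, or 'R', or 'L'
    """
    state, scanned, action, new_state = quad
    w = next(i for i, s in enumerate(td) if s.startswith('q'))
    if td[w] != state or td[w + 1] != scanned:
        return None
    out = list(td)
    if action == 'R':                        # case II: move right
        out[w], out[w + 1] = td[w + 1], new_state
        if w + 2 == len(td):                 # ran off the right end: add a blank
            out.append('B')
    elif action == 'L':                      # case III: move left
        if w == 0:                           # ran off the left end: add a blank
            out = [new_state, 'B'] + td[1:]
        else:
            out[w - 1], out[w] = new_state, td[w - 1]
    else:                                    # case I: print a symbol
        out[w], out[w + 1] = new_state, action
    return out
```

In this representation, Cons(x, y, z) holds exactly when `step` applied to the description coded by x and the quadruple coded by z returns the description coded by y.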

NTD(x): x is the Gödel number of a numerical tape description, that is, a
tape description in which the tape has the form $E_1\overline{k}E_2$, where each of $E_1$ and
$E_2$ is empty or consists entirely of blanks, and the location of the reading
head is arbitrary:


$$TD(x) \wedge (\forall u)_{u<\ell h(x)}(Sym((x)_u) \Rightarrow (x)_u = 7 \vee (x)_u = 11) \wedge (\forall u)_{u<\ell h(x)}(\forall v)_{v<\ell h(x)}(\forall w)_{w<\ell h(x)}(u < v \wedge v < w \wedge (x)_u = 11 \wedge (x)_w = 11 \Rightarrow (x)_v \ne 7) \wedge (\exists u)_{u<\ell h(x)}((x)_u = 11)$$


Stop(x, z): z is the Gödel number of a Turing machine $\mathcal{T}$ and x is the Gödel
number of a tape description α such that $\mathcal{T}$ stops at α:


$$TM(z) \wedge TD(x) \wedge \neg(\exists u)_{u<\ell h(x)}[IS((x)_u) \wedge (\exists v)_{v<\ell h(z)}(((z)_v)_0 = (x)_u \wedge ((z)_v)_1 = (x)_{u+1})]$$


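Comp(y, z): z is the Gödel number of a Turing machine $\mathcal{T}$ and y is the Gödel
number of a computation of $\mathcal{T}$:


$$y > 1 \wedge TM(z) \wedge (\forall u)_{u<\ell h(y)} TD((y)_u) \wedge Stop((y)_{\ell h(y) \dot{-} 1}, z) \wedge (\forall u)_{u<\ell h(y) \dot{-} 1}(\exists w)_{w<\ell h(z)} Cons((y)_u, (y)_{u+1}, (z)_w) \wedge (\forall v)_{v<\ell h((y)_0)}(IS(((y)_0)_v) \Rightarrow ((y)_0)_v = 9)$$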


Num(x): The Gödel number of the word $\overline{x}$, that is, of $|^{x+1}$:


$$Num(x) = \prod_{u \le x} p_u^{11}$$


TR($x_1, \ldots, x_n$): The Gödel number of the tape representation $\overline{(x_1, \ldots, x_n)}$ of the
n-tuple $(x_1, \ldots, x_n)$:


$$TR(x_1, \ldots, x_n) = Num(x_1) * 2^7 * Num(x_2) * 2^7 * \cdots * 2^7 * Num(x_n)$$


U(y): If y is the Gödel number of a computation that results in a numerical
tape description, then U(y) is the number represented on that final tape.*


$$U(y) = \Big[\sum_{u < \ell h((y)_{\ell h(y) \dot{-} 1})} \overline{sg}\big(|((y)_{\ell h(y) \dot{-} 1})_u - 11|\big)\Big] \dot{-} 1$$


[Let w be the number, represented by $|^{w+1}$, on the final tape. The calculation
of U(y) tallies a 1 for every stroke | that appears on the final tape. This yields
a sum of w + 1, and then 1 is subtracted to obtain w.]


* If y is not the Gödel number of a computation that yields a numerical tape description, U(y)
is defined, but its value in such cases will be of no significance.
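A small Python sketch may help in reading Num, TR, and U. To keep the numbers manageable it works mostly with code sequences (lists of the symbol codes 7, 9, 11) rather than with their Gödel numbers; the helper names `num`, `tr`, `u_value`, and `godel` are hypothetical, and `u_value` takes the final tape description directly instead of extracting it from a coded computation.

```python
STROKE, BLANK, Q0 = 11, 7, 9     # codes of |, B, and q_0 from the text

def num(x):
    """Code sequence of the word |^(x+1) that represents x."""
    return [STROKE] * (x + 1)

def tr(*args):
    """Code sequence of the tape representation of (x_1, ..., x_n):
    the words for the x_i separated by single blanks."""
    sequence = []
    for i, x in enumerate(args):
        if i > 0:
            sequence.append(BLANK)
        sequence.extend(num(x))
    return sequence

def u_value(final_description):
    """Count one for each stroke on the final tape, then subtract 1 (never below 0)."""
    strokes = sum(1 for code in final_description if code == STROKE)
    return max(strokes - 1, 0)

def godel(sequence):
    """Goedel number p_0^c_0 * p_1^c_1 * ... of a code sequence."""
    number, prime, found = 1, 2, []
    for code in sequence:
        while any(prime % p == 0 for p in found):
            prime += 1
        found.append(prime)
        number *= prime ** code
    return number
```

With these conventions, `godel(num(x))` plays the role of Num(x), and `godel([Q0] + tr(x1, ..., xn))` is the code of the initial tape description on which a computation begins.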

$T_n(z, x_1, \ldots, x_n, y)$: y is the Gödel number of a computation of a Turing
machine with Gödel number z such that the computation begins on the tape
$\overline{(x_1, \ldots, x_n)}$, with the reading head scanning the first | in $\overline{x_1}$, and ends with a
numerical tape description:


$$Comp(y, z) \wedge (y)_0 = 2^9 * TR(x_1, \ldots, x_n) \wedge NTD((y)_{\ell h(y) \dot{-} 1})$$


When n = 1, replace $TR(x_1, \ldots, x_n)$ by $Num(x_1)$. (Observe that, if $T_n(z, x_1, \ldots, x_n, y_1)$
and $T_n(z, x_1, \ldots, x_n, y_2)$, then $y_1 = y_2$, since there is at most one computation of a
Turing machine starting with a given initial tape.)


Proposition 5.5 


If $\mathcal{T}$ is a Turing machine that computes a number-theoretic function $f(x_1, \ldots, x_n)$
and e is a Gödel number of $\mathcal{T}$, then*


$$f(x_1, \ldots, x_n) = U(\mu y\,T_n(e, x_1, \ldots, x_n, y))$$


Proof 


Let $k_1, \ldots, k_n$ be any natural numbers. Then $f(k_1, \ldots, k_n)$ is defined if and only
if there is a computation of $\mathcal{T}$ beginning with $\overline{(k_1, \ldots, k_n)}$ and ending with
a numerical tape description, that is, if and only if $(\exists y)T_n(e, k_1, \ldots, k_n, y)$. So,
$f(k_1, \ldots, k_n)$ is defined if and only if $\mu y\,T_n(e, k_1, \ldots, k_n, y)$ is defined. Moreover,
when $f(k_1, \ldots, k_n)$ is defined, $\mu y\,T_n(e, k_1, \ldots, k_n, y)$ is the Gödel number of a
computation of $\mathcal{T}$ beginning with $\overline{(k_1, \ldots, k_n)}$, and $U(\mu y\,T_n(e, k_1, \ldots, k_n, y))$ is the
value yielded by the computation, namely, $f(k_1, \ldots, k_n)$.


Corollary 5.6 


Every Turing-computable function is partial recursive. 


Proof 


Assume $f(x_1, \ldots, x_n)$ is Turing-computable by a Turing machine with Gödel
number e. Then $f(x_1, \ldots, x_n) = U(\mu y\,T_n(e, x_1, \ldots, x_n, y))$. Since $T_n$ is primitive
recursive, $\mu y\,T_n(e, x_1, \ldots, x_n, y)$ is partial recursive. Hence, $U(\mu y\,T_n(e, x_1, \ldots, x_n, y))$
is partial recursive.


* Remember that an equality between two partial functions means that, whenever one of them
is defined, the other is also defined and the two functions have the same value.


Corollary 5.7 


The Turing-computable functions are identical with the partial recursive 
functions. 


Proof 


Use Corollaries 5.6 and 5.3. 


Corollary 5.8 
Every total partial recursive function is recursive. 


Proof 


Assume that the total partial recursive function $f(x_1, \ldots, x_n)$ is Turing-computable
by the Turing machine with Gödel number e. Then, for all $x_1, \ldots,
x_n$, $(\exists y)T_n(e, x_1, \ldots, x_n, y)$. Hence, $\mu y\,T_n(e, x_1, \ldots, x_n, y)$ is produced by an application
of the restricted μ-operator and is, therefore, recursive. So, $U(\mu y\,T_n(e, x_1, \ldots,
x_n, y))$ is also recursive. Now use Proposition 5.5.


Corollary 5.9 


For any total number-theoretic function f, f is recursive if and only if f is 
Turing-computable. 


Proof 


Use Corollaries 5.7-5.8 and Exercise 5.15. 

Church’s thesis amounts to the assertion that the recursive functions 
are the same as the computable total functions. By Corollary 5.9, this is 
equivalent to the identity, for total functions, of computability and Turing 
computability. This strengthens the case for Church’s thesis because of the 
plausibility of the identification of Turing computability with computabil- 
ity. Let us now widen Church’s thesis to assert that the computable func- 
tions (partial or total) are the same as the Turing-computable functions. By 
Corollary 5.7, this implies that a function is computable if and only if it is 
partial recursive.


Corollary 5.10


Any number-theoretic function is Turing-computable if and only if it is stan- 
dard Turing-computable. 


Proof 


Use Proposition 5.1, Corollary 5.6, and Proposition 5.2. 


Corollary 5.11 (Kleene’s Normal Form Theorem) 


As z varies over all natural numbers, $U(\mu y\,T_n(z, x_1, \ldots, x_n, y))$ enumerates with
repetitions all partial recursive functions of n variables.


Proof


Use Corollary 5.3 and Proposition 5.5. The fact that every partial recursive
function of n variables reappears for infinitely many z follows from
Exercise 5.7. (Notice that, when z is not the Gödel number of a Turing
machine, there is no y such that $T_n(z, x_1, \ldots, x_n, y)$, and, therefore, the corresponding
partial recursive function is the empty function.*)


Corollary 5.12 


For any recursive relation $R(x_1, \ldots, x_n, y)$, there exist natural numbers $z_0$ and $v_0$
such that, for all natural numbers $x_1, \ldots, x_n$:
a. $(\exists y)R(x_1, \ldots, x_n, y)$ if and only if $(\exists y)T_n(z_0, x_1, \ldots, x_n, y)$
b. $(\forall y)R(x_1, \ldots, x_n, y)$ if and only if $(\forall y)\neg T_n(v_0, x_1, \ldots, x_n, y)$


Proof 


a. The function $f(x_1, \ldots, x_n) = \mu y\,R(x_1, \ldots, x_n, y)$ is partial recursive. Let $z_0$
be a Gödel number of a Turing machine that computes f. Hence, $f(x_1,
\ldots, x_n)$ is defined if and only if $(\exists y)T_n(z_0, x_1, \ldots, x_n, y)$. But $f(x_1, \ldots, x_n)$ is
defined if and only if $(\exists y)R(x_1, \ldots, x_n, y)$.

b. Applying part (a) to the recursive relation $\neg R(x_1, \ldots, x_n, y)$, we obtain
a number $v_0$ such that:


$$(\exists y)\neg R(x_1, \ldots, x_n, y) \text{ if and only if } (\exists y)T_n(v_0, x_1, \ldots, x_n, y)$$


Now take the negations of both sides of this equivalence.


* The empty function is the empty set ∅. It has the empty set as its domain.

Exercise


5.19 Extend Corollary 5.12 to two or more quantifiers. For example, if $R(x_1, \ldots,
x_n, y, z)$ is a recursive relation, show that there are natural numbers $z_0$
and $v_0$ such that, for all $x_1, \ldots, x_n$:


a. $(\forall z)(\exists y)R(x_1, \ldots, x_n, y, z)$ if and only if $(\forall z)(\exists y)T_{n+1}(z_0, x_1, \ldots, x_n, z, y)$.
b. $(\exists z)(\forall y)R(x_1, \ldots, x_n, y, z)$ if and only if $(\exists z)(\forall y)\neg T_{n+1}(v_0, x_1, \ldots, x_n, z, y)$.


Corollary 5.13 


a. $(\exists y)T_n(x_1, x_1, x_2, \ldots, x_n, y)$ is not recursive.
b. $(\exists y)T_n(z, x_1, \ldots, x_n, y)$ is not recursive.


Proof 


a. Assume $(\exists y)T_n(x_1, x_1, x_2, \ldots, x_n, y)$ is recursive. Then the relation
$\neg(\exists y)T_n(x_1, x_1, x_2, \ldots, x_n, y) \wedge z = z$ is recursive. So, by Corollary 5.12(a), there
exists $z_0$ such that


$$(\exists z)(\neg(\exists y)T_n(x_1, x_1, x_2, \ldots, x_n, y) \wedge z = z) \text{ if and only if } (\exists z)T_n(z_0, x_1, x_2, \ldots, x_n, z)$$


Hence, since z obviously can be omitted on the left,


$$\neg(\exists y)T_n(x_1, x_1, x_2, \ldots, x_n, y) \text{ if and only if } (\exists z)T_n(z_0, x_1, x_2, \ldots, x_n, z)$$


Let $x_1 = x_2 = \cdots = x_n = z_0$. Then we obtain the contradiction


$$\neg(\exists y)T_n(z_0, z_0, z_0, \ldots, z_0, y) \text{ if and only if } (\exists z)T_n(z_0, z_0, z_0, \ldots, z_0, z)$$


b. If $(\exists y)T_n(z, x_1, x_2, \ldots, x_n, y)$ were recursive, so would be, by substitution,
$(\exists y)T_n(x_1, x_1, x_2, \ldots, x_n, y)$, contradicting part (a).


Exercises 


5.20 Prove that there is a partial recursive function g(z, x) such that, for
any partial recursive function f(x), there is a number $z_0$ for which f(x) =
$g(z_0, x)$ holds for all x. Then show that there must exist a number $v_0$ such
that $g(v_0, v_0)$ is not defined.


5.21 Let $h_1(x_1, \ldots, x_n), \ldots, h_k(x_1, \ldots, x_n)$ be partial recursive functions, and let
$R_1(x_1, \ldots, x_n), \ldots, R_k(x_1, \ldots, x_n)$ be recursive relations that are exhaustive
(i.e., for any $x_1, \ldots, x_n$, at least one of the relations holds) and pairwise
mutually exclusive (i.e., for any $x_1, \ldots, x_n$, no two of the relations hold).
Define


$$g(x_1, \ldots, x_n) = \begin{cases} h_1(x_1, \ldots, x_n) & \text{if } R_1(x_1, \ldots, x_n) \\ \quad\vdots \\ h_k(x_1, \ldots, x_n) & \text{if } R_k(x_1, \ldots, x_n) \end{cases}$$


Prove that g is partial recursive.


5.22 A partial function f(x) is said to be recursively completable if there is a
recursive function h(x) such that, for every x in the domain of f, h(x) = f(x).
a. Prove that $\mu y\,T_1(x, x, y)$ is not recursively completable.
b. Prove that a partial recursive function f(x) is recursively completable
if the domain D of f is a recursive set, that is, if the property
"x ∈ D" is recursive.
c. Find a partial recursive function f(x) that is recursively completable
but whose domain is not recursive.


5.23 If R(x, y) is a recursive relation, prove that there are natural numbers $z_0$
and $v_0$ such that

a. $\neg[(\exists y)R(z_0, y) \Leftrightarrow (\forall y)\neg T_1(z_0, z_0, y)]$

b. $\neg[(\forall y)R(v_0, y) \Leftrightarrow (\exists y)T_1(v_0, v_0, y)]$


5.24 If S(x) is a recursive property, show that there are natural numbers $z_0$
and $v_0$ such that

a. $\neg[S(z_0) \Leftrightarrow (\forall y)\neg T_1(z_0, z_0, y)]$

b. $\neg[S(v_0) \Leftrightarrow (\exists y)T_1(v_0, v_0, y)]$


5.25 Show that there is no recursive function $B(z, x_1, \ldots, x_n)$ such that, if z is a
Gödel number of a Turing machine $\mathcal{T}$ and $k_1, \ldots, k_n$ are natural numbers
for which $f_{\mathcal{T},n}(k_1, \ldots, k_n)$ is defined, then the number of steps in the
computation of $f_{\mathcal{T},n}(k_1, \ldots, k_n)$ is less than $B(z, k_1, \ldots, k_n)$.
Let $\mathcal{T}$ be a Turing machine. The halting problem for $\mathcal{T}$ is the problem of
determining, for each tape description β, whether $\mathcal{T}$ is applicable to β, that
is, whether there is a computation of $\mathcal{T}$ that begins with β.

We say that the halting problem for $\mathcal{T}$ is algorithmically solvable if there is
an algorithm that, given a tape description β, determines whether $\mathcal{T}$ is applicable
to β. Instead of a tape description β, we may assume that the algorithm
is given the Gödel number of β. Then the desired algorithm will be a computable
function $H_{\mathcal{T}}$ such that


$$H_{\mathcal{T}}(x) = \begin{cases} 0 & \text{if } x \text{ is the Gödel number of a tape description } \beta \text{ to which } \mathcal{T} \text{ is applicable} \\ 1 & \text{otherwise} \end{cases}$$


If we accept Turing algorithms as exact counterparts of algorithms (that is,
the extended Church's thesis), then the halting problem for $\mathcal{T}$ is algorithmically
solvable if and only if the function $H_{\mathcal{T}}$ is Turing-computable, or equivalently,
by Corollary 5.9, recursive. When the function $H_{\mathcal{T}}$ is recursive, we say
that the halting problem for $\mathcal{T}$ is recursively solvable. If $H_{\mathcal{T}}$ is not recursive, we
say that the halting problem for $\mathcal{T}$ is recursively unsolvable.


Proposition 5.14 


There is a Turing machine with a recursively unsolvable halting problem. 


Proof 


By Proposition 5.2, let $\mathcal{T}$ be a Turing machine that ST-computes the partial
recursive function $\mu y\,T_1(x, x, y)$. Remember that, by the definition of standard
Turing computability, if $\mathcal{T}$ is begun on the tape consisting of only $\overline{x}$ with its
reading head scanning the leftmost |, then $\mathcal{T}$ stops if and only if $\mu y\,T_1(x, x, y)$
is defined. Assume that $\mathcal{T}$ has a recursively solvable halting problem, that is,
that the function $H_{\mathcal{T}}$ is recursive. Recall that the Gödel number of the tape
description $q_0\overline{x}$ is $2^9 * Num(x)$. Now,


$(\exists y)T_1(x, x, y)$ if and only if $\mu y\,T_1(x, x, y)$ is defined
  if and only if $\mathcal{T}$, begun on $q_0\overline{x}$, performs a computation
  if and only if $H_{\mathcal{T}}(2^9 * Num(x)) = 0$


Since $H_{\mathcal{T}}$, Num, and * are recursive, $(\exists y)T_1(x, x, y)$ is recursive, contradicting
Corollary 5.13(a) (when n = 1).


Exercises 


5.26 Give an example of a Turing machine with a recursively solvable halt- 
ing problem. 


5.27 Show that the following special halting problem is recursively unsolvable:
given a Gödel number z of a Turing machine $\mathcal{T}$ and a natural number
x, determine whether $\mathcal{T}$ is applicable to $q_0\overline{x}$.


5.28 Show that the following self-halting problem is recursively unsolvable:
given a Gödel number z of a Turing machine $\mathcal{T}$, determine whether $\mathcal{T}$ is
applicable to $q_0\overline{z}$.

5.29 The printing problem for a Turing machine $\mathcal{T}$ and a symbol $a_k$ is the
problem of determining, for any given tape description α, whether $\mathcal{T}$,
begun on α, ever prints the symbol $a_k$. Find a Turing machine $\mathcal{T}$ and a
symbol $a_k$ for which the printing problem is recursively unsolvable.

5.30 Show that the following decision problem is recursively unsolvable:
given any Turing machine $\mathcal{T}$, if $\mathcal{T}$ is begun on an empty tape, determine
whether $\mathcal{T}$ stops (that is, whether $\mathcal{T}$ is applicable to $q_0B$).

5.31 Show that the problem of deciding, for any given Turing machine,
whether it has a recursively unsolvable halting problem is itself recursively
unsolvable.


To deal with more intricate decision problems and other aspects of the 
theory of computability, we need more powerful tools. First of all, let us 
introduce the notation 


$$\varphi_z^n(x_1, \ldots, x_n) = U(\mu y\,T_n(z, x_1, \ldots, x_n, y))$$


Thus, by Corollary 5.11, $\varphi_0^n, \varphi_1^n, \varphi_2^n, \ldots$ is an enumeration of all partial recursive
functions of n variables. The subscript j is called an index of the function
$\varphi_j^n$. Each partial recursive function of n variables has infinitely many
indices.


Proposition 5.15 (Iteration Theorem or s-m-n Theorem) 


For any positive integers m and n, there is a primitive recursive function
$s_n^m(z, y_1, \ldots, y_m)$ such that


$$\varphi_z^{m+n}(x_1, \ldots, x_n, y_1, \ldots, y_m) = \varphi^n_{s_n^m(z, y_1, \ldots, y_m)}(x_1, \ldots, x_n)$$


Thus, not only does assigning particular values to $z, y_1, \ldots, y_m$ in
$\varphi_z^{m+n}(x_1, \ldots, x_n, y_1, \ldots, y_m)$ yield a new partial recursive function of n variables,
but also the index of the resulting function is a primitive recursive
function of the old index z and of $y_1, \ldots, y_m$.
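The content of the theorem can be illustrated informally in Python, under assumptions that are ours rather than the text's: a "program" is a source string that defines a function named `phi`, the source string itself plays the role of the index, and the helper names `run` and `s` are hypothetical. The point of the sketch is that `s` only manipulates program text; it never runs the program it is given, so it is total even when that program does not terminate.

```python
def run(source, *args):
    """Interpret `source` as a program: it must define a function named phi."""
    env = {}
    exec(source, env)
    return env['phi'](*args)

def s(source, *ys):
    """Return the source of a program that behaves like `source` with its
    trailing arguments fixed to the values ys."""
    return (
        "def phi(*xs):\n"
        f"    env = {{}}\n"
        f"    exec({source!r}, env)\n"
        f"    return env['phi'](*xs, *{ys!r})\n"
    )

# A two-argument program and the one-argument program obtained by fixing y = 10.
ADD = "def phi(x, y):\n    return x + 2 * y\n"
FIXED = s(ADD, 10)
```

Here `run(FIXED, x)` agrees with `run(ADD, x, 10)` for every x, which mirrors the displayed equation with m = n = 1.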


Proof 
If $\mathcal{T}$ is a Turing machine with Gödel number z, let $\mathcal{T}_{y_1, \ldots, y_m}$ be a Turing
machine that, when begun on $\overline{(x_1, \ldots, x_n)}$, produces $\overline{(x_1, \ldots, x_n, y_1, \ldots, y_m)}$,
moves back to the leftmost | of $\overline{x_1}$, and then behaves like $\mathcal{T}$. Such a machine
is defined by the diagram


Re(aqr)""'r(ayr)? "8... ela)!" og 7 


The Gödel number $s_n^m(z, y_1, \ldots, y_m)$ of this Turing machine can be effectively
computed and, by Church's thesis, would be partial recursive. In fact, $s_n^m$ can
be computed by a primitive recursive function $g(z, y_1, \ldots, y_m)$ defined in the
following manner. Let $t = y_1 + \cdots + y_m + 2m + 1$. Also, let $u(i) = 2^{9+4i}3^7 5^{11} 7^{9+4i}$
and $v(i) = 2^{9+4i}3^{11}5^3 7^{9+4(i+1)}$. Notice that u(i) is the Gödel number of the quadruple
$q_iB|q_i$ and v(i) is the Gödel number of the quadruple $q_i|Rq_{i+1}$. Then take


$g(z, y_1, \ldots, y_m)$ to be


[Displayed definition of $g(z, y_1, \ldots, y_m)$: a *-product of prime powers whose exponents are the Gödel numbers of the quadruples of $\mathcal{T}_{y_1, \ldots, y_m}$. It is built from the quadruples u(i) and v(i) that print the words $\overline{y_1}, \ldots, \overline{y_m}$, from quadruples that return the reading head to the left end, and from the quadruples $(z)_i$ of $\mathcal{T}$, $i < \ell h(z)$, with their internal state numbers shifted by 4(t + 3). The individual factors are not legible in this copy.]


g is primitive recursive by the results of Section 3.3. When z is not a Gödel
number of a Turing machine, $\varphi_z^{m+n}$ is the empty function and, therefore,
$s_n^m(z, y_1, \ldots, y_m)$ must be an index of the empty function and can be taken to
be 0. Thus, we define


$$s_n^m(z, y_1, \ldots, y_m) = \begin{cases} g(z, y_1, \ldots, y_m) & \text{if } TM(z) \\ 0 & \text{otherwise} \end{cases}$$


Hence, $s_n^m$ is primitive recursive.


Corollary 5.16 


For any partial recursive function $f(x_1, \ldots, x_n, y_1, \ldots, y_m)$, there is a recursive
function $g(y_1, \ldots, y_m)$ such that


$$f(x_1, \ldots, x_n, y_1, \ldots, y_m) = \varphi^n_{g(y_1, \ldots, y_m)}(x_1, \ldots, x_n)$$

Proof


Let e be an index of f. By Proposition 5.15,


$$\varphi_e^{m+n}(x_1, \ldots, x_n, y_1, \ldots, y_m) = \varphi^n_{s_n^m(e, y_1, \ldots, y_m)}(x_1, \ldots, x_n)$$


Let $g(y_1, \ldots, y_m) = s_n^m(e, y_1, \ldots, y_m)$.


Examples 


1. Let G(x) be a fixed partial recursive function with nonempty
domain. Consider the following decision problem: for any u, determine
whether $\varphi_u^1 = G$. Let us show that this problem is recursively
unsolvable, that is, that the property R(u), defined by $\varphi_u^1 = G$, is not
recursive. Assume, for the sake of contradiction, that R is recursive.
Consider the function $f(x, u) = G(x) \cdot N(Z(\mu y\,T_1(u, u, y)))$. (Recall that
N(Z(t)) = 1 for all t.) Applying Corollary 5.16 to f(x, u), we obtain a
recursive function g(u) such that $f(x, u) = \varphi^1_{g(u)}(x)$. For any fixed
u, $\varphi^1_{g(u)} = G$ if and only if $(\exists y)T_1(u, u, y)$. (Here, we use the fact that G
has nonempty domain.) Hence, $(\exists y)T_1(u, u, y)$ if and only if R(g(u)).
Since R(g(u)) is recursive, $(\exists y)T_1(u, u, y)$ would be recursive, contradicting
Corollary 5.13(a).


2. A universal Turing machine. Let the partial recursive function
$U(\mu y\,T_1(z, x, y))$ be computed by a Turing machine $\mathcal{V}$ with Gödel number
e. Thus, $U(\mu y\,T_1(z, x, y)) = U(\mu y\,T_2(e, z, x, y))$. $\mathcal{V}$ is universal in the
following sense. First, it can compute every partial recursive function
f(x) of one variable. If z is a Gödel number of a Turing machine
that computes f, then, if $\mathcal{V}$ begins on the tape $\overline{(z, x)}$, it will compute
$U(\mu y\,T_1(z, x, y)) = f(x)$. Further, $\mathcal{V}$ can be used to compute any partial
recursive function $h(x_1, \ldots, x_n)$. Let $v_0$ be a Gödel number of a Turing
machine that computes h, and let $f(x) = h((x)_0, (x)_1, \ldots, (x)_{n-1})$. Then
$h(x_1, \ldots, x_n) = f(p_0^{x_1} \cdots p_{n-1}^{x_n})$. By applying Corollary 5.16 to the partial
recursive function $U(\mu y\,T_n(v, (x)_0, (x)_1, \ldots, (x)_{n-1}, y))$, we obtain a recursive
function g(v) such that $U(\mu y\,T_n(v, (x)_0, (x)_1, \ldots, (x)_{n-1}, y)) = \varphi^1_{g(v)}(x)$.
Hence, $f(x) = \varphi^1_{g(v_0)}(x)$. So $h(x_1, \ldots, x_n)$ is computed by applying $\mathcal{V}$ to the
tape $\overline{(g(v_0), p_0^{x_1} \cdots p_{n-1}^{x_n})}$.


Exercises 


5.32 Find a superuniversal Turing machine $\mathcal{V}^*$ such that, for any Turing
machine $\mathcal{T}$, if z is a Gödel number of $\mathcal{T}$ and x is the Gödel number of
an initial tape description α of $\mathcal{T}$, then $\mathcal{V}^*$ is applicable to $q_0\overline{(z, x)}$ if and
only if $\mathcal{T}$ is applicable to α; moreover, if $\mathcal{T}$, when applied to α, ends
with a tape description that has Gödel number w, then $\mathcal{V}^*$, when applied to
$q_0\overline{(z, x)}$, produces $\overline{w}$.


5.33 Show that the following decision problem is recursively unsolvable: for
any u and v, determine whether $\varphi_u^1 = \varphi_v^1$.


5.34 Show that the following decision problem is recursively unsolvable:
for any u, determine whether $\varphi_u^1$ has empty domain. (Hence,
the condition in Example 1 above, that G(x) has nonempty domain, is
unnecessary.)


5.35 a. Prove that there is a recursive function g(u, v) such that
$\varphi^1_{g(u,v)}(x) = \varphi_u^1(x) \cdot \varphi_v^1(x)$.
b. Prove that there is a recursive function C(u, v) such that
$\varphi^1_{C(u,v)}(x) = \varphi_u^1(\varphi_v^1(x))$.

5.4 The Kleene-Mostowski Hierarchy:
Recursively Enumerable Sets


Consider the following array, where $R(x_1, \ldots, x_n, y_1, \ldots, y_k)$ is a recursive
relation:


                    $R(x_1, \ldots, x_n)$
$(\exists y_1)R(x_1, \ldots, x_n, y_1)$                    $(\forall y_1)R(x_1, \ldots, x_n, y_1)$
$(\exists y_1)(\forall y_2)R(x_1, \ldots, x_n, y_1, y_2)$          $(\forall y_1)(\exists y_2)R(x_1, \ldots, x_n, y_1, y_2)$
$(\exists y_1)(\forall y_2)(\exists y_3)R(x_1, \ldots, x_n, y_1, y_2, y_3)$    $(\forall y_1)(\exists y_2)(\forall y_3)R(x_1, \ldots, x_n, y_1, y_2, y_3)$
                    ⋮


Let $\Sigma_0^n = \Pi_0^n$ = the set of all n-place recursive relations. For k > 0, let $\Sigma_k^n$ be the
set of all n-place relations expressible in the prenex form $(\exists y_1)(\forall y_2) \ldots (Qy_k)
R(x_1, \ldots, x_n, y_1, \ldots, y_k)$, consisting of k alternating quantifiers beginning with an
existential quantifier and followed by a recursive relation R. (Here, "$(Qy_k)$"
denotes $(\exists y_k)$ or $(\forall y_k)$, depending on whether k is odd or even.) Let $\Pi_k^n$ be the
set of all n-place relations expressible in the prenex form $(\forall y_1)(\exists y_2) \ldots (Qy_k)
R(x_1, \ldots, x_n, y_1, \ldots, y_k)$, consisting of k alternating quantifiers beginning with
a universal quantifier and followed by a recursive relation R. Then the array
above can be written


        $\Sigma_0^n$
$\Sigma_1^n$        $\Pi_1^n$
$\Sigma_2^n$        $\Pi_2^n$
$\Sigma_3^n$        $\Pi_3^n$
        ⋮


This array of classes of relations is called the Kleene-Mostowski hierarchy, or
the arithmetical hierarchy.


Proposition 5.17 


a. Every relation expressible in any form listed above is expressible in all
the forms in lower rows; that is, for all j > k,


$$\Sigma_k^n \subseteq \Sigma_j^n \cap \Pi_j^n \quad \text{and} \quad \Pi_k^n \subseteq \Sigma_j^n \cap \Pi_j^n$$


b. There is a relation of each form, except $\Sigma_0^n$, that is not expressible in the
other form in the same row and, hence, by part (a), not in any of the
rows above; that is, for k > 0,


$$\Sigma_k^n - \Pi_k^n \ne \emptyset \quad \text{and} \quad \Pi_k^n - \Sigma_k^n \ne \emptyset$$


c. Every arithmetical relation is expressible in at least one of these forms.

d. (Post) For any relation $Q(x_1, \ldots, x_n)$, Q is recursive if and only if both Q
and ¬Q are expressible in the form $(\exists y_1)R(x_1, \ldots, x_n, y_1)$, where R is recursive;
that is, $\Sigma_1^n \cap \Pi_1^n = \Sigma_0^n$.

e. If $Q_1 \in \Sigma_k^n$ and $Q_2 \in \Sigma_k^n$, then $Q_1 \vee Q_2$ and $Q_1 \wedge Q_2$ are in $\Sigma_k^n$. If $Q_1 \in \Pi_k^n$ and
$Q_2 \in \Pi_k^n$, then $Q_1 \vee Q_2$ and $Q_1 \wedge Q_2$ are in $\Pi_k^n$.


f. In contradistinction to part (d), if k > 0, then


$$(\Sigma_{k+1}^n \cap \Pi_{k+1}^n) - (\Sigma_k^n \cup \Pi_k^n) \ne \emptyset$$

Proof 


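a. $(\exists z_1)(\forall y_1) \ldots (\exists z_k)(\forall y_k)R(x_1, \ldots, x_n, z_1, y_1, \ldots, z_k, y_k)$

$\Leftrightarrow (\forall u)(\exists z_1)(\forall y_1) \ldots (\exists z_k)(\forall y_k)(R(x_1, \ldots, x_n, z_1, y_1, \ldots, z_k, y_k) \wedge u = u)$

$\Leftrightarrow (\exists z_1)(\forall y_1) \ldots (\exists z_k)(\forall y_k)(\exists u)(R(x_1, \ldots, x_n, z_1, y_1, \ldots, z_k, y_k) \wedge u = u)$


Hence, any relation expressible in one of the forms in the array is
expressible in both forms in any lower row.


b. Let us consider a typical case, say $\Sigma_3^n$. Take the relation $(\exists v)(\forall z)(\exists y)
T_{n+2}(x_1, x_1, x_2, \ldots, x_n, v, z, y)$, which is in $\Sigma_3^n$. Assume that this is in $\Pi_3^n$,
that is, it is expressible in the form $(\forall v)(\exists z)(\forall y)R(x_1, \ldots, x_n, v, z, y)$,
where R is recursive. By Exercise 5.19, this relation is equivalent to $(\forall v)
(\exists z)(\forall y)\neg T_{n+2}(e, x_1, \ldots, x_n, v, z, y)$ for some e. When $x_1 = e$, this yields a
contradiction.

c. Every wf of the first-order theory S can be put into prenex normal form.
Then, it suffices to note that $(\exists u)(\exists v)R(u, v)$ is equivalent to $(\exists z)R((z)_0,
(z)_1)$, and $(\forall u)(\forall v)R(u, v)$ is equivalent to $(\forall z)R((z)_0, (z)_1)$. Hence, successive
quantifiers of the same kind can be condensed into one such
quantifier.


d. If Q is recursive, so is ¬Q, and, if $P(x_1, \ldots, x_n)$ is recursive, then $P(x_1, \ldots,
x_n) \Leftrightarrow (\exists y)(P(x_1, \ldots, x_n) \wedge y = y)$. Conversely, assume Q is expressible as
$(\exists y)R_1(x_1, \ldots, x_n, y)$ and ¬Q as $(\exists y)R_2(x_1, \ldots, x_n, y)$, where the relations $R_1$
and $R_2$ are recursive. Hence, $(\forall x_1) \ldots (\forall x_n)(\exists y)(R_1(x_1, \ldots, x_n, y) \vee R_2(x_1,
\ldots, x_n, y))$. So, $\psi(x_1, \ldots, x_n) = \mu y(R_1(x_1, \ldots, x_n, y) \vee R_2(x_1, \ldots, x_n, y))$ is recursive.
Then, $Q(x_1, \ldots, x_n) \Leftrightarrow R_1(x_1, \ldots, x_n, \psi(x_1, \ldots, x_n))$ and, therefore, Q is
recursive.

e. Use the following facts. If x is not free in $\mathcal{A}$:


$\vdash (\exists x)(\mathcal{A} \vee \mathcal{B}) \Leftrightarrow (\mathcal{A} \vee (\exists x)\mathcal{B})$,   $\vdash (\exists x)(\mathcal{A} \wedge \mathcal{B}) \Leftrightarrow (\mathcal{A} \wedge (\exists x)\mathcal{B})$,
$\vdash (\forall x)(\mathcal{A} \vee \mathcal{B}) \Leftrightarrow (\mathcal{A} \vee (\forall x)\mathcal{B})$,   $\vdash (\forall x)(\mathcal{A} \wedge \mathcal{B}) \Leftrightarrow (\mathcal{A} \wedge (\forall x)\mathcal{B})$

f. We shall suggest a proof in the case n = 1; the other cases are then
easy consequences. Let $Q(x) \in \Sigma_k^1 - \Pi_k^1$. Define P(x) as $(\exists z)[(x = 2z \wedge
Q(z)) \vee (x = 2z + 1 \wedge \neg Q(z))]$. It is easy to prove that $P \notin \Sigma_k^1 \cup \Pi_k^1$ and that
$P \in \Sigma_{k+1}^1$. Observe that P(x) holds if and only if


$$(\exists z)(x = 2z \wedge Q(z)) \vee ((\exists z)_{z<x}(x = 2z + 1) \wedge (\forall z)(x = 2z + 1 \Rightarrow \neg Q(z)))$$


Hence, $P \in \Pi_{k+1}^1$ (Rogers, 1959).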
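The converse direction in part (d) is, in effect, a decision procedure: search for the least y at which either $R_1$ or $R_2$ holds, and answer according to which one was found. The following Python sketch shows this search; the function name `decide` and the sample relations are our own illustration, and the loop terminates only under the hypothesis of part (d) that, for every x, some y satisfies $R_1$ or $R_2$.

```python
from itertools import count

def decide(r1, r2, x):
    """Decide Q(x), assuming Q(x) iff some y has r1(x, y),
    and not Q(x) iff some y has r2(x, y)."""
    for y in count():          # y = 0, 1, 2, ... is the unbounded search for psi(x)
        if r1(x, y):
            return True        # a witness for Q(x) was found first
        if r2(x, y):
            return False       # a witness for not-Q(x) was found first

# Sample relations (assumed for illustration): Q(x) says "x is a perfect square".
def is_root(x, y):
    return y * y == x

def is_gap(x, y):
    return y * y < x < (y + 1) * (y + 1)
```

Because Q and ¬Q cannot both hold, the first witness found settles the question, which is exactly the step $Q(x_1, \ldots, x_n) \Leftrightarrow R_1(x_1, \ldots, x_n, \psi(x_1, \ldots, x_n))$ in the proof.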


Exercises 


5.36 For any relation W of n variables, prove that $W \in \Sigma_k^n$ if and only if
$\overline{W} \in \Pi_k^n$, where $\overline{W}$ is the complement of W with respect to the set of all
n-tuples of natural numbers.


5.37 For each k > 0, find a universal relation $V_k$ in $\Sigma_k^{n+1}$; that is, for any relation
W of n variables: (a) if $W \in \Sigma_k^n$, then there exists $z_0$ such that, for all
$x_1, \ldots, x_n$, $W(x_1, \ldots, x_n)$ if and only if $V_k(z_0, x_1, \ldots, x_n)$; and (b) if $W \in \Pi_k^n$,
there exists $v_0$ such that, for all $x_1, \ldots, x_n$, $W(x_1, \ldots, x_n)$ if and only if $\neg V_k(v_0,
x_1, \ldots, x_n)$. [Hint: Use Exercise 5.19.]


The s-m-n theorem (Proposition 5.15) enables us to prove the following 
basic result of recursion theory. 


Proposition 5.18 (Recursion Theorem) 


If n > 1 and $f(x_1, \ldots, x_n)$ is a partial recursive function, then there exists a
natural number e such that


$$f(x_1, \ldots, x_{n-1}, e) = \varphi_e^{n-1}(x_1, \ldots, x_{n-1})$$


Proof 


Let d be an index of $f(x_1, \ldots, x_{n-1}, s_{n-1}^1(x_n, x_n))$. Then


$$f(x_1, \ldots, x_{n-1}, s_{n-1}^1(x_n, x_n)) = \varphi_d^n(x_1, \ldots, x_{n-1}, x_n)$$


By the s-m-n theorem, $\varphi_d^n(x_1, \ldots, x_n) = \varphi^{n-1}_{s_{n-1}^1(d, x_n)}(x_1, \ldots, x_{n-1})$. Let $e = s_{n-1}^1(d, d)$.
Then


$$f(x_1, \ldots, x_{n-1}, e) = f(x_1, \ldots, x_{n-1}, s_{n-1}^1(d, d)) = \varphi_d^n(x_1, \ldots, x_{n-1}, d) = \varphi^{n-1}_{s_{n-1}^1(d, d)}(x_1, \ldots, x_{n-1}) = \varphi_e^{n-1}(x_1, \ldots, x_{n-1})$$

Corollary 5.19 (Fixed-Point Theorem)
If h(x) is recursive, then there exists e such that $\varphi_e^1 = \varphi_{h(e)}^1$.


Proof


Applying the recursion theorem to $f(x, u) = \varphi_{h(u)}^1(x)$, we obtain a number e
such that $f(x, e) = \varphi_e^1(x)$. But $f(x, e) = \varphi_{h(e)}^1(x)$.
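A familiar informal consequence of the recursion theorem (Proposition 5.18) is the existence of self-reproducing programs: taking f(x, u) = u yields an index e with $\varphi_e^1(x) = e$, a program that outputs its own index. The Python sketch below is an analogy under our own assumptions, with source text standing in for the index; the names `TEMPLATE`, `PROGRAM`, and `run` are hypothetical. The construction follows the shape of the proof: a template is applied to a quotation of itself, much as e is obtained as $s_{n-1}^1(d, d)$.

```python
import contextlib
import io

# A template with one slot (%r); filling the slot with the template's own
# quotation gives a complete program.
TEMPLATE = 's = %r\nprint(s %% s, end="")'
PROGRAM = TEMPLATE % TEMPLATE      # source text of the self-printing program

def run(source):
    """Execute `source` and return everything it writes to standard output."""
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        exec(source, {})
    return buffer.getvalue()
```

Running `PROGRAM` prints exactly the text of `PROGRAM`, so the program computes its own description without being handed it as input.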


Corollary 5.20 (Rice’s Theorem) (Rice, 1953) 


Let $\mathcal{F}$ be a set consisting of at least one, but not all, partial recursive functions
of one variable. Then the set $A = \{u \mid \varphi_u^1 \in \mathcal{F}\}$ is not recursive.


Proof


By hypothesis, there exist numbers $u_1$ and $u_2$ such that $u_1 \in A$ and $u_2 \notin A$.
Now assume that A is recursive. Define


$$h(x) = \begin{cases} u_1 & \text{if } x \notin A \\ u_2 & \text{if } x \in A \end{cases}$$


Clearly, $h(x) \in A$ if and only if $x \notin A$. h is recursive, by Proposition 3.19. By the
fixed-point theorem, there is a number e such that $\varphi_e^1 = \varphi_{h(e)}^1$. Then we obtain
a contradiction as follows:


$e \in A$  if and only if $\varphi_e^1 \in \mathcal{F}$
       if and only if $\varphi_{h(e)}^1 \in \mathcal{F}$
       if and only if $h(e) \in A$
       if and only if $e \notin A$


Rice’s theorem can be used to show the recursive unsolvability of various 
decision problems. 


Example 


Consider the following decision problem: for any u, determine whether $\varphi_u^1$
has an infinite domain. Let $\mathcal{F}$ be the set of all partial recursive functions of
one variable that have infinite domain. By Rice's theorem, $\{u \mid \varphi_u^1 \in \mathcal{F}\}$ is not
recursive. Hence, the problem is recursively undecidable.

Notice that Example 1 (following Corollary 5.16) and Exercise 5.34 can be handled in the
same way.

Exercises


5.38 Show that the following decision problems are recursively unsolvable.
a. For any u, determine whether $\varphi_u^1$ has infinite range.
b. For any u, determine whether $\varphi_u^1$ is a constant function.
c. For any u, determine whether $\varphi_u^1$ is recursive.
5.39 a. Show that there is a number e such that the domain of $\varphi_e^1$ is {e}.
b. Show that there is a number e such that the domain of $\varphi_e^1$ is ω − {e}.


5.40 This exercise will show the existence of a recursive function that is not
primitive recursive.

a. Let $[\sqrt{x}]$ be the largest integer less than or equal to $\sqrt{x}$. Show that
$[\sqrt{x}]$ is defined by the recursion

$$\kappa(0) = 0$$
$$\kappa(x+1) = \kappa(x) + \overline{\mathrm{sg}}\,\bigl|(x+1) - (\kappa(x)+1)^2\bigr|$$

Hence, $[\sqrt{x}]$ is primitive recursive.

b. The function $\mathrm{Quadrem}(x) = x \dot{-} [\sqrt{x}]^2$ is primitive recursive and
represents the difference between $x$ and the largest square less than
or equal to $x$.

c. Let $\rho(x, y) = ((x+y)^2 + y)^2 + x$, $\rho_1(z) = \mathrm{Quadrem}(z)$, and
$\rho_2(z) = \mathrm{Quadrem}([\sqrt{z}])$. These functions are primitive recursive.
Prove the following:

i. $\rho_1(\rho(x, y)) = x$ and $\rho_2(\rho(x, y)) = y$.
ii. $\rho$ is a one-one function from $\omega^2$ into $\omega$.
iii. $\rho_1(0) = \rho_2(0) = 0$ and, if $\rho_1(x+1) \neq 0$,

$$\rho_1(x+1) = \rho_1(x) + 1$$
$$\rho_2(x+1) = \rho_2(x)$$

iv. Let $\rho^2$ denote $\rho$, and, for $n \geq 3$, define $\rho^n(x_1, \ldots, x_n) = \rho(\rho^{n-1}(x_1, \ldots, x_{n-1}), x_n)$.
Then each $\rho^n$ is primitive recursive. Define
$\rho_i^n(x) = \rho_i^{n-1}(\rho_1(x))$ for $1 \leq i \leq n-1$, and $\rho_n^n(x) = \rho_2(x)$. Then each
$\rho_i^n$, $1 \leq i \leq n$, is primitive recursive, and $\rho_i^n(\rho^n(x_1, \ldots, x_n)) = x_i$.
Hence, $\rho^n$ is a one-one function of $\omega^n$ into $\omega$. The $\rho^n$'s and the $\rho_i^n$'s
are obtained from $\rho$, $\rho_1$, and $\rho_2$ by substitution.


d. The recursion rule (V) (page 174) can be limited to the form

$$F(x_1, \ldots, x_{n+1}, 0) = x_{n+1} \quad (n \geq 0)$$
$$F(x_1, \ldots, x_{n+1}, y+1) = G(x_1, \ldots, x_{n+1}, y, F(x_1, \ldots, x_{n+1}, y))$$

[Hint: Given

$$f(x_1, \ldots, x_n, 0) = g(x_1, \ldots, x_n)$$
$$f(x_1, \ldots, x_n, y+1) = h(x_1, \ldots, x_n, y, f(x_1, \ldots, x_n, y))$$

define $F$ as above, letting $G(x_1, \ldots, x_{n+1}, y, z) = h(x_1, \ldots, x_n, y, z)$. Then
$f(x_1, \ldots, x_n, y) = F(x_1, \ldots, x_n, g(x_1, \ldots, x_n), y)$.]

e. Taking $x + y$, $x \cdot y$, and $[\sqrt{x}]$ as additional initial functions, we can
limit the recursion rule to the one-parameter form:

$$F(x, 0) = G(x)$$
$$F(x, y+1) = H(x, y, F(x, y))$$

[Hint: Let $n \geq 2$. Given

$$f(x_1, \ldots, x_n, 0) = g(x_1, \ldots, x_n)$$
$$f(x_1, \ldots, x_n, y+1) = h(x_1, \ldots, x_n, y, f(x_1, \ldots, x_n, y))$$

let $F(u, y) = f(\rho_1^n(u), \ldots, \rho_n^n(u), y)$. Define $F$ by a permissible recursion.
(Note that $\delta(x)$, $x \dot{-} y$, $\rho^n$, and $\rho_i^n$ are available.) Then $f(x_1, \ldots, x_n, y) =
F(\rho^n(x_1, \ldots, x_n), y)$.]

f. Taking $x + y$, $x \cdot y$, and $[\sqrt{x}]$ as additional initial functions, we can
use $H(y, F(x, y))$ instead of $H(x, y, F(x, y))$ in part (e).

[Hint: Given

$$F(x, 0) = G(x)$$
$$F(x, y+1) = H(x, y, F(x, y))$$

let $F_1(x, y) = \rho(x, F(x, y))$. Then $x = \rho_1(F_1(x, y))$ and $F(x, y) = \rho_2(F_1(x, y))$.
Define $F_1(x, y)$ by a permissible recursion.]

g. Taking $x + y$, $x \cdot y$, and $[\sqrt{x}]$ as additional initial functions, we can
limit uses of the recursion rule to the form

$$f(x, 0) = x$$
$$f(x, y+1) = h(y, f(x, y))$$

[Hint: Given

$$F(x, 0) = G(x)$$
$$F(x, y+1) = h(y, F(x, y))$$

define $f$ as above. Then $F(x, y) = f(G(x), y)$.]


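h. Taking $x + y$, $x \cdot y$, $[\sqrt{x}]$, and $x \dot{-} y$ as additional initial functions, we
can limit uses of the recursion rule to those of the form

$$g(0) = 0$$
$$g(y+1) = H(y, g(y))$$

[Hint: First note that $|x - y| = (x \dot{-} y) + (y \dot{-} x)$ and that $[\sqrt{x}]$ is definable
by a suitable recursion. Now, given

$$f(x, 0) = x$$
$$f(x, y+1) = h(y, f(x, y))$$

let $g(x) = f(\rho_2(x), \rho_1(x))$. Then

$$g(0) = f(\rho_2(0), \rho_1(0)) = f(0, 0) = 0$$

$$\begin{aligned}
g(x+1) &= f(\rho_2(x+1), \rho_1(x+1)) \\
&= \begin{cases} \rho_2(x+1) & \text{if } \rho_1(x+1) = 0 \\ h(\rho_1(x+1) \dot{-} 1,\ f(\rho_2(x+1), \rho_1(x+1) \dot{-} 1)) & \text{if } \rho_1(x+1) \neq 0 \end{cases} \\
&= \begin{cases} \rho_2(x+1) & \text{if } \rho_1(x+1) = 0 \\ h(\rho_1(x),\ f(\rho_2(x), \rho_1(x))) & \text{if } \rho_1(x+1) \neq 0 \end{cases} \\
&= \begin{cases} \rho_2(x+1) & \text{if } \rho_1(x+1) = 0 \\ h(\rho_1(x), g(x)) & \text{if } \rho_1(x+1) \neq 0 \end{cases} \\
&= \rho_2(x+1) \cdot \overline{\mathrm{sg}}(\rho_1(x+1)) + h(\rho_1(x), g(x)) \cdot \mathrm{sg}(\rho_1(x+1)) \\
&= H(x, g(x))
\end{aligned}$$

Then $f(x, y) = g(\rho(y, x))$. (Note that sg is obtainable by a recursion of
the appropriate form and $\overline{\mathrm{sg}}(x) = 1 \dot{-} x$.)]

i. In part (h), $H(y, g(y))$ can be replaced by $H(g(y))$.
[Hint: Given

$$g(0) = 0$$
$$g(y+1) = H(y, g(y))$$

let $f(u) = \rho(u, g(u))$ and $\varphi(w) = \rho(\rho_1(w) + 1, H(\rho_1(w), \rho_2(w)))$. Then

$$f(0) = 0$$
$$f(y+1) = \varphi(f(y))$$

and $g(u) = \rho_2(f(u))$. (Note that sg(x) is given by a recursion of the
specified form.)]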


j. Show that the equations

$$\psi(x, 0) = x + 1$$
$$\psi(0, y+1) = \psi(1, y)$$
$$\psi(x+1, y+1) = \psi(\psi(x, y+1), y)$$

define a number-theoretic function. In addition, prove:

I. $\psi(x, y) > x$.
II. $\psi(x, y)$ is monotonic in $x$, that is, if $x < z$, then $\psi(x, y) < \psi(z, y)$.
III. $\psi(x+1, y) \leq \psi(x, y+1)$.
IV. $\psi(x, y)$ is monotonic in $y$, that is, if $y < z$, then $\psi(x, y) < \psi(x, z)$.
V. Use the recursion theorem to show that $\psi$ is recursive. [Hint:
Use Exercise 5.21 to show that there is a partial recursive
function $g$ such that $g(x, 0, u) = x + 1$, $g(0, y+1, u) = \varphi_u^2(1, y)$, and
$g(x+1, y+1, u) = \varphi_u^2(\varphi_u^2(x, y+1), y)$. Then use the recursion
theorem to find $e$ such that $g(x, y, e) = \varphi_e^2(x, y)$. By induction, show
that $g(x, y, e) = \psi(x, y)$.]
VI. For every primitive recursive function $f(x_1, \ldots, x_n)$, there is
some fixed $m$ such that

$$f(x_1, \ldots, x_n) < \psi(\max(x_1, \ldots, x_n), m)$$

for all $x_1, \ldots, x_n$. [Hint: Prove this first for the initial functions
$Z$, $N$, $U_i^n$, $x + y$, $x \times y$, $[\sqrt{x}]$, and $x \dot{-} y$, and then show that it is
preserved by substitution and the recursion of part (i).] Hence,
for every primitive recursive function $f(x)$, there is some $m$
such that $f(x) < \psi(x, m)$ for all $x$.

VII. Prove that $\psi(x, x) + 1$ is recursive but not primitive recursive.
For other proofs of the existence of recursive functions that
are not primitive recursive, see Ackermann (1928), Péter (1935,
1967), and R.M. Robinson (1948).
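The functions of Exercise 5.40 can be explored numerically. The following is a minimal Python sketch, not part of the exercise: the names `quadrem`, `rho`, `rho1`, `rho2`, and `psi` are illustrative choices, and `math.isqrt` plays the role of $[\sqrt{x}]$. It evaluates the pairing function of part (c) and the function $\psi$ of part (j), so that properties such as (c)(i) and (j)(I)-(IV) can be spot-checked on small arguments. Such checks are no substitute for the proofs the exercise asks for.

```python
from functools import lru_cache
from math import isqrt


def quadrem(x):
    """x minus the largest square less than or equal to x."""
    return x - isqrt(x) ** 2


def rho(x, y):
    """Pairing function of part (c): ((x + y)^2 + y)^2 + x."""
    return ((x + y) ** 2 + y) ** 2 + x


def rho1(z):
    """First inverse of rho."""
    return quadrem(z)


def rho2(z):
    """Second inverse of rho."""
    return quadrem(isqrt(z))


@lru_cache(maxsize=None)
def psi(x, y):
    """The function of part (j); values grow very fast, so keep y small."""
    if y == 0:
        return x + 1
    if x == 0:
        return psi(1, y - 1)
    return psi(psi(x - 1, y), y - 1)


if __name__ == "__main__":
    # Part (c)(i) on a small grid.
    print(all(rho1(rho(x, y)) == x and rho2(rho(x, y)) == y
              for x in range(20) for y in range(20)))
    # A few rows of psi for y = 0, 1, 2, 3.
    for y in range(4):
        print(y, [psi(x, y) for x in range(5)])
```

Memoization with `lru_cache` only keeps the small cases tractable. For $y \geq 4$ the values and the recursion depth exceed what this sketch can handle.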


A set of natural numbers is said to be recursively enumerable (r.e.) if and
only if it is either empty or the range of a recursive function. If we accept
Church's thesis, a collection of natural numbers is recursively enumerable
if and only if it is empty or it is generated by some mechanical process or
effective procedure.


Proposition 5.21


a. A set $B$ is r.e. if and only if $x \in B$ is expressible in the form $(\exists y)R(x, y)$,
   where $R$ is recursive. (We even can allow $R$ here to be primitive
   recursive.)

b. $B$ is r.e. if and only if $B$ is either empty or the range of a partial recursive
   function.*

c. $B$ is r.e. if and only if $B$ is the domain of a partial recursive function.

d. $B$ is recursive if and only if $B$ and its complement $\bar{B}$ are r.e.†

e. The set $K = \{x \mid (\exists y)T_1(x, x, y)\}$ is r.e. but not recursive.


Proof


a. Assume $B$ is r.e. If $B$ is empty, then $x \in B \Leftrightarrow (\exists y)(x \neq x \wedge y \neq y)$. If $B$ is
   nonempty, then $B$ is the range of a recursive function $g$. Then $x \in B \Leftrightarrow (\exists y)(g(y) = x)$.
   Conversely, assume $x \in B \Leftrightarrow (\exists y)R(x, y)$, where $R$ is recursive.
   If $B$ is empty, then $B$ is r.e. If $B$ is nonempty, then let $k$ be a fixed element
   of $B$. Define

   $$\theta(z) = \begin{cases} k & \text{if } \neg R((z)_0, (z)_1) \\ (z)_0 & \text{if } R((z)_0, (z)_1) \end{cases}$$

   $\theta$ is recursive by Proposition 3.19. Clearly, $B$ is the range of $\theta$. (We
   can take $R$ to be primitive recursive, since, if $R$ is recursive, then, by
   Corollary 5.12(a), $(\exists y)R(x, y) \Leftrightarrow (\exists y)T_1(e, x, y)$ for some $e$, and $T_1(e, x, y)$ is
   primitive recursive.)

b. Assume $B$ is the range of a partial recursive function $g$. If $B$ is empty,
   then $B$ is r.e. If $B$ is nonempty, then let $k$ be a fixed element of $B$. By
   Corollary 5.11, there is a number $e$ such that $g(x) = U(\mu y\, T_1(e, x, y))$. Let

   $$h(z) = \begin{cases} U((z)_1) & \text{if } T_1(e, (z)_0, (z)_1) \\ k & \text{if } \neg T_1(e, (z)_0, (z)_1) \end{cases}$$

   By Proposition 3.19, $h$ is primitive recursive. Clearly, $B$ is the range of $h$.
   Hence, $B$ is r.e.


* Since the empty function is partial recursive and has the empty set as its range, the condition 
that B is empty can be omitted. 
† $\bar{B} = \omega - B$, where $\omega$ is the set of natural numbers.
c. Assume $B$ is r.e. If $B$ is empty, then $B$ is the domain of the partial recursive
   function $\mu y(x + y + 1 = 0)$. If $B$ is nonempty, then $B$ is the range of a
   recursive function $g$. Let $G$ be the partial recursive function such that
   $G(y) = \mu x(g(x) = y)$. Then $B$ is the domain of $G$. Conversely, assume $B$ is
   the domain of a partial recursive function $H$. Then there is a number $e$
   such that $H(x) = U(\mu y\, T_1(e, x, y))$. Hence, $x \in B$ if and only if $(\exists y)T_1(e, x, y)$.
   Since $T_1(e, x, y)$ is recursive, $B$ is r.e. by part (a).

d. Use part (a) and Proposition 5.17(d). (The intuitive meaning of part (d) is
   the following: if there are mechanical procedures for generating $B$ and
   $\bar{B}$, then to determine whether any number $n$ is in $B$ we need only wait
   until $n$ is generated by one of the procedures and then observe which
   procedure produced it.)

e. Use parts (a) and (d) and Corollary 5.13(a).
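The constructions in parts (a) and (d) can be illustrated by a minimal Python sketch. Everything named here is an assumption made for illustration: `exponent`, `theta`, and `decide` are hypothetical helper names, the relation `is_square_root` and the enumerators of the even and odd numbers are toy examples, and ordinary Python functions stand in for recursive functions. `theta` follows the definition of $\theta$ in part (a), with $(z)_0$ and $(z)_1$ read as the exponents of 2 and 3 in $z$. `decide` is the waiting procedure described in part (d).

```python
from itertools import count


def exponent(z, p):
    """Exponent of the prime p in z, with the convention that it is 0 for z = 0."""
    e = 0
    while z > 0 and z % p == 0:
        z //= p
        e += 1
    return e


def theta(z, R, k):
    """Part (a): return (z)_0 if R((z)_0, (z)_1) holds, and the fixed element k otherwise."""
    x, y = exponent(z, 2), exponent(z, 3)
    return x if R(x, y) else k


def decide(n, enum_b, enum_co_b):
    """Part (d): run enumerations of B and of its complement side by side
    until n appears in one of them. Both enumerators are assumed total."""
    for i in count():
        if enum_b(i) == n:
            return True
        if enum_co_b(i) == n:
            return False


def is_square_root(x, y):
    """Toy relation R(x, y): y * y = x, so that B is the set of perfect squares."""
    return y * y == x


if __name__ == "__main__":
    # The range of theta lists exactly the perfect squares (k = 0 is one of them).
    print(sorted({theta(z, is_square_root, 0) for z in range(1, 200)}))
    # Toy enumerations: B = even numbers, complement = odd numbers.
    print(decide(10, lambda i: 2 * i, lambda i: 2 * i + 1))
    print(decide(7, lambda i: 2 * i, lambda i: 2 * i + 1))
```

The loop in `decide` terminates because every number lies in $B$ or in its complement, so one of the two enumerations must eventually produce it.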


Remember that the functions $\varphi_n^1(x) = U(\mu y\, T_1(n, x, y))$ form an enumeration
of all partial recursive functions of one variable. If we designate the domain
of $\varphi_n^1$ by $W_n$, then Proposition 5.21(c) tells us that $W_0, W_1, W_2, \ldots$ is an
enumeration (with repetitions) of all r.e. sets. The number $n$ is called the index of
the set $W_n$.


Exercises


5.41 Prove that a set $B$ is r.e. if and only if it is either empty or the range of a
     primitive recursive function. [Hint: See the proof of Proposition 5.21(b).]

5.42 a. Prove that the inverse image of an r.e. set $B$ under a partial recursive
        function $f$ is r.e. (that is, $\{x \mid f(x) \in B\}$ is r.e.).
     b. Prove that the inverse image of a recursive set under a recursive
        function is recursive.
     c. Prove that the image of an r.e. set under a partial recursive function
        is r.e.
     d. Using Church's thesis, give intuitive arguments for the results in
        parts (a)-(c).
     e. Show that the image of a recursive set under a recursive function
        need not be recursive.

5.43 Prove that an infinite set is recursive if and only if it is the range of
     a strictly increasing recursive function. ($g$ is strictly increasing if $x < y$
     implies $g(x) < g(y)$.)

5.44 Prove that an infinite set is r.e. if and only if it is the range of a one-one
     recursive function.

5.45 Prove that every infinite r.e. set contains an infinite recursive subset.

5.46 Assume that $A$ and $B$ are r.e. sets.
     a. Prove that $A \cup B$ is r.e. [In fact, show that there is a recursive function
        $g(u, v)$ such that $W_{g(u,v)} = W_u \cup W_v$.]
     b. Prove that $A \cap B$ is r.e. [In fact, show that there is a recursive function
        $h(u, v)$ such that $W_{h(u,v)} = W_u \cap W_v$.]
     c. Show that $\bar{A}$ need not be r.e.
     d. Prove that $\bigcup_{n \in A} W_n$ is r.e.
5.47 Show that the assertion

     (∇) A set $B$ is r.e. if and only if $B$ is effectively enumerable (that is, there
     is a mechanical procedure for generating the numbers in $B$)

     is equivalent to Church's thesis.


5.48 Prove that the set $A = \{u \mid W_u = \omega\}$ is not r.e.


5.49 A set $B$ is called creative if and only if $B$ is r.e. and there is a partial recursive
     function $h$ such that, for any $n$, if $W_n \subseteq \bar{B}$, then $h(n) \in \bar{B} - W_n$.

     a. Prove that $\{x \mid (\exists y)T_1(x, x, y)\}$ is creative.
     b. Show that every creative set is nonrecursive.


5.50^D A set $B$ is called simple if $B$ is r.e., $\bar{B}$ is infinite, and $\bar{B}$ contains no infinite
     r.e. set. Clearly, every simple set is nonrecursive. Show that a simple set
     exists.


5.51 A recursive permutation is a one-one recursive function from $\omega$ onto $\omega$.
     Sets $A$ and $B$ are called isomorphic (written $A \simeq B$) if there is a recursive
     permutation that maps $A$ onto $B$.

     a. Prove that the recursive permutations form a group under the
        operation of composition.

     b. Prove that $\simeq$ is an equivalence relation.

     c. Prove that, if $A$ is recursive (r.e., creative, simple) and $A \simeq B$, then
        $B$ is recursive (r.e., creative, simple).

     Myhill (1955) proved that any two creative sets are isomorphic. (See
     also Bernays, 1957.)


5.52 $A$ is many-one reducible to $B$ (written $A\,R_m\,B$) if there is a recursive
     function $f$ such that $u \in A$ if and only if $f(u) \in B$. (Many-one
     reducibility of $A$ to $B$ implies that, if the decision problem for membership
     in $B$ is recursively solvable, so is the decision problem for membership
     in $A$.) $A$ and $B$ are called many-one equivalent (written $A \equiv_m B$)
     if $A\,R_m\,B$ and $B\,R_m\,A$. $A$ is one-one reducible to $B$ (written $A\,R_1\,B$) if there
     is a one-one recursive function $f$ such that $u \in A$ if and only if $f(u) \in B$.
     $A$ and $B$ are called one-one equivalent (written $A \equiv_1 B$) if $A\,R_1\,B$
     and $B\,R_1\,A$.

     a. Prove that $\equiv_m$ and $\equiv_1$ are equivalence relations.

     b. Prove that, if $A$ is creative, $B$ is r.e., and $A\,R_m\,B$, then $B$ is creative.
        [Myhill (1955) showed that, if $A$ is creative and $B$ is r.e., then
        $B\,R_m\,A$.]

     c. (Myhill, 1955) Prove that, if $A\,R_1\,B$ then $A\,R_m\,B$, and if $A \equiv_1 B$ then $A \equiv_m B$.
        However, many-one reducibility does not imply one-one reducibility,
        and many-one equivalence does not imply one-one equivalence.
        [Hint: Let $A$ be a simple set, $C$ an infinite recursive subset of $A$, and
        $B = A - C$. Then $A\,R_1\,B$ and $B\,R_m\,A$ but not-($B\,R_1\,A$).] It can be shown that
        $A \equiv_1 B$ if and only if $A \simeq B$.


5.53 (Dekker, 1955) $A$ is said to be productive if there is a partial recursive
     function $f$ such that, if $W_n \subseteq A$, then $f(n) \in A - W_n$. Prove the
     following.

     a. If $A$ is productive, then $A$ is not r.e.; hence, both $A$ and $\bar{A}$ are infinite.

     b.^D If $A$ is productive, then $A$ has an infinite r.e. subset. Hence, if $A$ is
        productive, $\bar{A}$ is not simple.

     c. If $A$ is r.e., then $A$ is creative if and only if $\bar{A}$ is productive.

     d.^D There exist $2^{\aleph_0}$ productive sets.


5.54 (Dekker and Myhill, 1960) $A$ is recursively equivalent to $B$ (written $A \sim B$)
     if there is a one-one partial recursive function that maps $A$ onto $B$.

     a. Prove that $\sim$ is an equivalence relation.

     b. $A$ is said to be immune if $A$ is infinite and $A$ has no infinite r.e.
        subset. $A$ is said to be isolated if $A$ is not recursively equivalent
        to a proper subset of $A$. (The isolated sets may be considered the
        counterparts of the Dedekind-finite sets.) Prove that an infinite set
        is isolated if and only if it is immune.

     c.^D Prove that there exist $2^{\aleph_0}$ immune sets.

Recursively enumerable sets play an important role in logic
because, if we assume Church's thesis, the set $T_K$ of Gödel numbers
of the theorems of any axiomatizable first-order theory $K$ is
r.e. (The same holds true of arbitrary formal axiomatic systems.) In
fact, the relation (see page 200)

     $\mathrm{Pf}_K(y, x)$: $y$ is the Gödel number of a proof in $K$ of a wf with
     Gödel number $x$

is recursive if the set of Gödel numbers of the axioms is recursive,
that is, if there is a decision procedure for axiomhood and Church's
thesis holds. Now, $x \in T_K$ if and only if $(\exists y)\mathrm{Pf}_K(y, x)$ and, therefore,
$T_K$ is r.e. Thus, if we accept Church's thesis, $K$ is decidable if and
only if the r.e. set $T_K$ is recursive. It was shown in Corollary 3.46
that every consistent extension $K$ of the theory RR is recursively
undecidable, that is, $T_K$ is not recursive.
Much more general results along these lines can be proved (see
Smullyan, 1961; Feferman, 1957; Putnam, 1957; Ehrenfeucht and
Feferman, 1960; and Myhill, 1955). For example, if $K$ is a first-order
theory with equality in the language $\mathcal{L}_A$ of arithmetic: (1) if every
recursive set is expressible in $K$, then $K$ is essentially recursively
undecidable, that is, for every consistent extension $K'$ of $K$, $T_{K'}$ is
not recursive (see Exercise 5.58); (2) if every recursive function is
representable in $K$ and $K$ satisfies conditions 4 and 5 on page 210,
then the set $T_K$ is creative. For further study of r.e. sets, see Post
(1944) and Rogers (1967); for the relationship between logic and
recursion theory, see Yasuhara (1971) and Monk (1976, part III).


Exercises


5.55 Let $K$ be a first-order theory with equality in the language $\mathcal{L}_A$ of
     arithmetic. A number-theoretic relation $B(x_1, \ldots, x_n)$ is said to be weakly
     expressible in $K$ if there is a wf $\mathcal{B}(x_1, \ldots, x_n)$ of $K$ such that, for any natural
     numbers $k_1, \ldots, k_n$, $B(k_1, \ldots, k_n)$ if and only if $\vdash_K \mathcal{B}(\bar{k}_1, \ldots, \bar{k}_n)$.

     a. Show that, if $K$ is consistent, then every relation expressible in $K$ is
        weakly expressible in $K$.

     b. Prove that, if every recursive relation is expressible in $K$ and $K$ is
        $\omega$-consistent, every r.e. set is weakly expressible in $K$. (Recall that, when
        we refer here to an r.e. set $B$, we mean the corresponding relation "$x \in B$.")

     c. If $K$ has a recursive vocabulary and a recursive axiom set, prove
        that any set that is weakly expressible in $K$ is r.e.

     d. If formal number theory $S$ is $\omega$-consistent, prove that a set $B$ is r.e. if
        and only if $B$ is weakly expressible in $S$.

5.56 a. (Craig, 1953) Let $K$ be a first-order theory such that the set $T_K$ of
        Gödel numbers of theorems of $K$ is r.e. Show that, if $K$ has a recursive
        vocabulary, $K$ is recursively axiomatizable.

     b. For any wf $\mathcal{B}$ of formal number theory $S$, let $\mathcal{B}^{\#}$ represent its translation
        into axiomatic set theory NBG (see page 276). Prove that the set of wfs
        $\mathcal{B}$ such that $\vdash_{NBG} \mathcal{B}^{\#}$ is a (proper) recursively axiomatizable extension
        of $S$. (However, no "natural" set of axioms for this theory is known.)

5.57 Given a set $A$ of natural numbers, let $u \in A^*$ if and only if $u$ is a Gödel
     number of a wf $\mathcal{B}(x_1)$ and the Gödel number of $\mathcal{B}(\bar{u})$ is in $A$. Prove that,
     if $A$ is recursive, then $A^*$ is recursive.

5.58 Let $K$ be a consistent theory in the language $\mathcal{L}_A$ of arithmetic.

     a. Prove that $(T_K)^*$ is not weakly expressible in $K$.

     b. If every recursive set is weakly expressible in $K$, show that $K$ is
        recursively undecidable.

     c. If every recursive set is expressible in $K$, prove that $K$ is essentially
        recursively undecidable.




5.5 Other Notions of Computability 


Computability has been treated here in terms of Turing machines because 
Turing’s definition is probably the one that makes clearest the equivalence 
between the precise mathematical concept and the intuitive notion.* We 
already have encountered other equivalent notions: standard Turing com- 
putability and partial recursiveness. One of the strongest arguments for the 
rightness of Turing’s definition is that all of the many definitions that have 
been proposed have turned out to be equivalent. We shall present several of 
these other definitions. 


5.5.1 Herbrand–Gödel Computability


The idea of defining computable functions in terms of fairly simple systems
of equations was proposed by Herbrand, given a more precise form by Gödel
(1934), and developed in detail by Kleene (1936a). The exposition given here
is a version of the presentation in Kleene (1952, Chapter XI).

First let us define the terms. 


1. All variables are terms.

2. 0 is a term.

3. If $t$ is a term, then $(t)'$ is a term.

4. If $t_1, \ldots, t_n$ are terms and $f_j^n$ is a function letter, then $f_j^n(t_1, \ldots, t_n)$ is a term.


For every natural number $n$, we define the corresponding numeral $\bar{n}$ as
follows: (1) $\bar{0}$ is 0 and (2) $\overline{n+1}$ is $(\bar{n})'$. Thus, every numeral is a term.

An equation is a formula $r = s$ where $r$ and $s$ are terms. A system $E$ of equations
is a finite sequence $r_1 = s_1, r_2 = s_2, \ldots, r_k = s_k$ of equations such that $r_k$ is
of the form $f_j^n(t_1, \ldots, t_n)$. The function letter $f_j^n$ is called the principal letter of
the system $E$. Those function letters (if any) that appear only on the right-hand
side of equations of $E$ are called the initial letters of $E$; any function
letter other than the principal letter that appears on the left-hand side of
some equations and also on the right-hand side of some equations is called
an auxiliary letter of $E$.

We have two rules of inference:

$R_1$: An equation $e_2$ is a consequence of an equation $e_1$ by $R_1$ if and only
if $e_2$ arises from $e_1$ by substituting any numeral $\bar{n}$ for all occurrences of a
variable.


* For further justification of this equivalence, see Turing (1936-1937), Kleene (1952, pp. 317-323, 
376-381), Mendelson (1990), Dershowitz and Gurevich (2008), and the papers in the collection 
Olszewski (2006). The work of Dershowitz and Gurevich, based on a finer analysis of the 
notion of computation, provides much stronger support for Church’s Thesis. 
$R_2$: An equation $e$ is a consequence by $R_2$ of equations $f_h^m(\bar{n}_1, \ldots, \bar{n}_m) = \bar{p}$
and $r = s$ if and only if $e$ arises from $r = s$ by replacing one occurrence of
$f_h^m(\bar{n}_1, \ldots, \bar{n}_m)$ in $s$ by $\bar{p}$, and $r = s$ contains no variables.

A proof of an equation $e$ from a set $B$ of equations is a sequence $e_0, \ldots, e_n$ of
equations such that $e_n$ is $e$ and, if $0 \leq i \leq n$, then: (1) $e_i$ is an equation of $B$, or
(2) $e_i$ is a consequence by $R_1$ of a preceding equation $e_j$ ($j < i$), or (3) $e_i$ is a
consequence by $R_2$ of two preceding equations $e_j$ and $e_m$ ($j < i$, $m < i$). We use the
notation $B \vdash e$ to state that there is a proof from $B$ of $e$ (or, in other words, that
$e$ is derivable from $B$).


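Example
Let $E$ be the system

$$f_1^1(x_1) = (x_1)'$$
$$f_1^2(x_1, x_2) = f_1^3(\bar{2}, x_2, f_1^1(x_1))$$

The principal letter of $E$ is $f_1^2$, $f_1^1$ is an auxiliary letter, and $f_1^3$ is an initial
letter. The sequence of equations

$$f_1^2(x_1, x_2) = f_1^3(\bar{2}, x_2, f_1^1(x_1))$$
$$f_1^2(\bar{2}, x_2) = f_1^3(\bar{2}, x_2, f_1^1(\bar{2}))$$
$$f_1^2(\bar{2}, \bar{1}) = f_1^3(\bar{2}, \bar{1}, f_1^1(\bar{2}))$$
$$f_1^1(x_1) = (x_1)'$$
$$f_1^1(\bar{2}) = (\bar{2})' \quad (\text{i.e., } f_1^1(\bar{2}) = \bar{3})$$
$$f_1^2(\bar{2}, \bar{1}) = f_1^3(\bar{2}, \bar{1}, \bar{3})$$

is a proof of $f_1^2(\bar{2}, \bar{1}) = f_1^3(\bar{2}, \bar{1}, \bar{3})$ from $E$.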

A number-theoretic partial function $\varphi(x_1, \ldots, x_n)$ is said to be computed by a
system $E$ of equations if and only if the principal letter of $E$ is a letter $f_j^n$ and,
for any natural numbers $k_1, \ldots, k_n, p$,

$E \vdash f_j^n(\bar{k}_1, \ldots, \bar{k}_n) = \bar{p}$ if and only if $\varphi(k_1, \ldots, k_n)$ is defined and $\varphi(k_1, \ldots, k_n) = p$.

The function $\varphi$ is called Herbrand–Gödel-computable (for short, HG-computable)
if and only if there is a system $E$ of equations by which $\varphi$ is computed.


Examples


1. Let $E$ be the system $f_1^1(x_1) = 0$. Then $E$ computes the zero function $Z$.
   Hence, $Z$ is HG-computable.

2. Let $E$ be the system $f_1^1(x_1) = (x_1)'$. Then $E$ computes the successor
   function $N$. Hence, $N$ is HG-computable.

3. Let $E$ be the system $f_1^n(x_1, \ldots, x_n) = x_i$. Then $E$ computes the projection
   function $U_i^n$. Hence, $U_i^n$ is HG-computable.

4. Let $E$ be the system

   $$f_1^2(x_1, 0) = x_1$$
   $$f_1^2(x_1, (x_2)') = (f_1^2(x_1, x_2))'$$

   Then $E$ computes the addition function.

5. Let $E$ be the system

   $$f_1^1(x_1) = 0$$
   $$f_1^1(x_1) = x_1$$

   The function $\varphi(x_1)$ computed by $E$ is the partial function with domain {0}
   such that $\varphi(0) = 0$. For every $k \neq 0$, $E \vdash f_1^1(\bar{k}) = 0$ and $E \vdash f_1^1(\bar{k}) = \bar{k}$. Hence, $\varphi(x_1)$
   is not defined for $x_1 \neq 0$.
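Examples 4 and 5 can be replayed mechanically. The following is a minimal Python sketch under stated assumptions, not the book's formalism: terms are encoded as nested tuples (an encoding chosen here), the names `match`, `values`, and `derivable` are hypothetical, every variable on a right-hand side is assumed to occur on the left-hand side, and the search is assumed to terminate. `derivable(letter, ks, system)` collects every $p$ for which an equation $f(\bar{k}_1, \ldots, \bar{k}_n) = \bar{p}$ can be reached by instantiating variables with numerals ($R_1$) and replacing innermost applications $f(\bar{n}_1, \ldots, \bar{n}_m)$ on the right by derived values ($R_2$). The function computed by the system is defined at an argument exactly when this set has one element.

```python
from itertools import product

# Encoding (an assumption of this sketch): "0" is zero, any other string is a
# variable, ("S", t) is the term (t)', and (letter, t1, ..., tn) is f(t1, ..., tn).


def match(pattern, n, env):
    """Match a left-hand-side argument built from 0, ', and variables against
    the numeral for n. Return the extended bindings, or None on failure."""
    if pattern == "0":
        return env if n == 0 else None
    if isinstance(pattern, str):
        if pattern in env:
            return env if env[pattern] == n else None
        return {**env, pattern: n}
    if pattern[0] == "S":
        return match(pattern[1], n - 1, env) if n > 0 else None
    return None  # a function letter inside a left side never becomes a numeral


def values(term, env, system):
    """All numbers a right-hand-side term can be reduced to under env."""
    if term == "0":
        return {0}
    if isinstance(term, str):
        return {env[term]}
    if term[0] == "S":
        return {v + 1 for v in values(term[1], env, system)}
    letter, args = term[0], term[1:]
    out = set()
    for ks in product(*(values(a, env, system) for a in args)):
        out |= derivable(letter, ks, system)
    return out


def derivable(letter, ks, system):
    """All p such that letter(k1, ..., kn) = p is derivable from the system."""
    out = set()
    for lhs, rhs in system:
        if lhs[0] != letter or len(lhs) - 1 != len(ks):
            continue
        env = {}
        for pattern, k in zip(lhs[1:], ks):
            env = match(pattern, k, env)
            if env is None:
                break
        if env is not None:
            out |= values(rhs, env, system)
    return out


# Example 4 (addition) and Example 5 (defined only at 0).
ADD = [(("f", "x1", "0"), "x1"),
       (("f", "x1", ("S", "x2")), ("S", ("f", "x1", "x2")))]
EX5 = [(("f", "x1"), "0"),
       (("f", "x1"), "x1")]

if __name__ == "__main__":
    print(derivable("f", (2, 3), ADD))
    print(derivable("f", (0,), EX5), derivable("f", (3,), EX5))
```

For Example 5 the set of derivable values at a nonzero argument has two elements, which is why the computed function is undefined there.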


Exercises


5.59 a. What functions are HG-computable by the following systems of
        equations?

        i. $f_1^1(0) = 0$, $f_1^1((x_1)') = x_1$
        ii. $f_1^2(x_1, 0) = x_1$, $f_1^2(0, x_2) = 0$, $f_1^2((x_1)', (x_2)') = f_1^2(x_1, x_2)$
        iii. $f_1^1(x_1) = 0$, $f_1^1(x_1) = 0'$
        iv. $f_1^2(x_1, 0) = x_1$, $f_1^2(x_1, (x_2)') = (f_1^2(x_1, x_2))'$, $f_1^1(x_1) = f_1^2(x_1, x_1)$

     b. Show that the following functions are HG-computable.

        i. $|x_1 - x_2|$
        ii. $x_1 \cdot x_2$
        iii. $\varphi(x) = \begin{cases} 0 & \text{if } x \text{ is even} \\ 1 & \text{if } x \text{ is odd} \end{cases}$

5.60 a. Find a system $E$ of equations that computes the $n$-place function that
        is nowhere defined.

     b. Let $f$ be an $n$-place function defined on a finite domain. Find a system
        of equations that computes $f$.

     c. If $f(x)$ is an HG-computable total function and $g(x)$ is a partial function
        that coincides with $f(x)$ except on a finite set $A$, where $g$ is undefined,
        find a system of equations that computes $g$.
Proposition 5.22


Every partial recursive function is HG-computable.


Proof


a. Examples 1-3 above show that the initial functions $Z$, $N$, and $U_i^n$ are
   HG-computable.

b. (Substitution rule (IV).) Let $\varphi(x_1, \ldots, x_n) = \eta(\psi_1(x_1, \ldots, x_n), \ldots, \psi_m(x_1, \ldots, x_n))$,
   where $\eta, \psi_1, \ldots, \psi_m$ have been shown to be
   HG-computable. Let $E_i$ be a system of equations computing $\psi_i$, with
   principal letter $f_i^n$, and let $E_{m+1}$ be a system of equations computing
   $\eta$, with principal letter $f_{m+1}^m$. By changing indices we may assume
   that no two of $E_1, \ldots, E_{m+1}$ have any function letters in common.
   Construct a system $E$ for $\varphi$ by listing $E_1, \ldots, E_{m+1}$ and then adding
   the equation $f_{m+2}^n(x_1, \ldots, x_n) = f_{m+1}^m(f_1^n(x_1, \ldots, x_n), \ldots, f_m^n(x_1, \ldots, x_n))$.
   (We may assume that $f_{m+2}^n$ does not occur in $E_1, \ldots, E_{m+1}$.) It is clear
   that, if $\varphi(k_1, \ldots, k_n) = p$, then $E \vdash f_{m+2}^n(\bar{k}_1, \ldots, \bar{k}_n) = \bar{p}$. Conversely,
   if $E \vdash f_{m+2}^n(\bar{k}_1, \ldots, \bar{k}_n) = \bar{p}$, then $E \vdash f_1^n(\bar{k}_1, \ldots, \bar{k}_n) = \bar{p}_1, \ldots, E \vdash f_m^n(\bar{k}_1, \ldots, \bar{k}_n) = \bar{p}_m$
   and $E \vdash f_{m+1}^m(\bar{p}_1, \ldots, \bar{p}_m) = \bar{p}$. Hence, it readily
   follows that $E_1 \vdash f_1^n(\bar{k}_1, \ldots, \bar{k}_n) = \bar{p}_1, \ldots, E_m \vdash f_m^n(\bar{k}_1, \ldots, \bar{k}_n) = \bar{p}_m$ and
   $E_{m+1} \vdash f_{m+1}^m(\bar{p}_1, \ldots, \bar{p}_m) = \bar{p}$. Consequently, $\psi_1(k_1, \ldots, k_n) = p_1, \ldots, \psi_m(k_1, \ldots, k_n) = p_m$
   and $\eta(p_1, \ldots, p_m) = p$. So, $\varphi(k_1, \ldots, k_n) = p$. [Hints as to the details of
   the proof may be found in Kleene (1952, Chapter XI, especially pp.
   262-270).] Hence, $\varphi$ is HG-computable.

c. (Recursion rule (V).) Let

   $$\varphi(x_1, \ldots, x_n, 0) = \psi(x_1, \ldots, x_n)$$
   $$\varphi(x_1, \ldots, x_n, x_{n+1} + 1) = \vartheta(x_1, \ldots, x_{n+1}, \varphi(x_1, \ldots, x_{n+1}))$$

   where $\psi$ and $\vartheta$ are HG-computable. Assume that $E_1$ is a system of equations
   computing $\psi$ with principal letter $f_1^n$ and that $E_2$ is a system of
   equations computing $\vartheta$ with principal letter $f_1^{n+2}$. Then form a system
   for computing $\varphi$ by adding to $E_1$ and $E_2$

   $$f_1^{n+1}(x_1, \ldots, x_n, 0) = f_1^n(x_1, \ldots, x_n)$$
   $$f_1^{n+1}(x_1, \ldots, x_n, (x_{n+1})') = f_1^{n+2}(x_1, \ldots, x_{n+1}, f_1^{n+1}(x_1, \ldots, x_{n+1}))$$

   (We assume that $E_1$ and $E_2$ have no function letters in common.) Clearly,
   if $\varphi(k_1, \ldots, k_n, k) = p$, then $E \vdash f_1^{n+1}(\bar{k}_1, \ldots, \bar{k}_n, \bar{k}) = \bar{p}$. Conversely, one can
   prove easily by induction on $k$ that, if $E \vdash f_1^{n+1}(\bar{k}_1, \ldots, \bar{k}_n, \bar{k}) = \bar{p}$, then
   $\varphi(k_1, \ldots, k_n, k) = p$. Therefore, $\varphi$ is HG-computable. (The case when the
   recursion has no parameters is even easier to handle.)
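d. ($\mu$-operator rule (VI).) Let $\varphi(x_1, \ldots, x_n) = \mu y(\psi(x_1, \ldots, x_n, y) = 0)$ and assume
   that $\psi$ is HG-computable by a system $E_1$ of equations with principal
   letter $f_1^{n+1}$. By parts (a)-(c), we know that every primitive recursive function
   is HG-computable. In particular, multiplication is HG-computable;
   hence, there is a system $E_2$ of equations having no function letters in
   common with $E_1$ and with principal letter $f_2^2$ such that $E_2 \vdash f_2^2(\bar{k}_1, \bar{k}_2) = \bar{p}$
   if and only if $k_1 \cdot k_2 = p$. We form a system $E_3$ by adding to $E_1$ and $E_2$ the
   equations

   $$f_3^{n+1}(x_1, \ldots, x_n, 0) = \bar{1}$$
   $$f_3^{n+1}(x_1, \ldots, x_n, (x_{n+1})') = f_2^2(f_3^{n+1}(x_1, \ldots, x_n, x_{n+1}), f_1^{n+1}(x_1, \ldots, x_n, x_{n+1}))$$

   One can prove by induction that $E_3$ computes the function $\prod_{y<z} \psi(x_1, \ldots, x_n, y)$;
   that is, $E_3 \vdash f_3^{n+1}(\bar{k}_1, \ldots, \bar{k}_n, \bar{k}) = \bar{p}$ if and only if $\prod_{y<k} \psi(k_1, \ldots, k_n, y) = p$. Now
   construct the system $E$ by adding to $E_3$ the equations

   $$f_4^3((x_1)', 0, x_3) = x_3$$
   $$f_3^n(x_1, \ldots, x_n) = f_4^3(f_3^{n+1}(x_1, \ldots, x_n, x_{n+1}), f_3^{n+1}(x_1, \ldots, x_n, (x_{n+1})'), x_{n+1})$$

   Then $E$ computes the function $\varphi(x_1, \ldots, x_n) = \mu y(\psi(x_1, \ldots, x_n, y) = 0)$. If
   $\mu y(\psi(k_1, \ldots, k_n, y) = 0) = q$, then $E_3 \vdash f_3^{n+1}(\bar{k}_1, \ldots, \bar{k}_n, \bar{q}) = \bar{p}'$, where $p + 1 = \prod_{y<q} \psi(k_1, \ldots, k_n, y)$,
   and $E_3 \vdash f_3^{n+1}(\bar{k}_1, \ldots, \bar{k}_n, \bar{q}') = 0$. Hence, $E \vdash f_3^n(\bar{k}_1, \ldots, \bar{k}_n) = f_4^3(\bar{p}', 0, \bar{q})$. But
   $E \vdash f_4^3(\bar{p}', 0, \bar{q}) = \bar{q}$, and so, $E \vdash f_3^n(\bar{k}_1, \ldots, \bar{k}_n) = \bar{q}$. Conversely, if $E \vdash f_3^n(\bar{k}_1, \ldots, \bar{k}_n) = \bar{q}$,
   then $E \vdash f_4^3(\bar{m}', 0, \bar{q}) = \bar{q}$, where $E_3 \vdash f_3^{n+1}(\bar{k}_1, \ldots, \bar{k}_n, \bar{q}) = \bar{m}'$ and
   $E_3 \vdash f_3^{n+1}(\bar{k}_1, \ldots, \bar{k}_n, \bar{q}') = 0$. Hence, $\prod_{y<q} \psi(k_1, \ldots, k_n, y) = m + 1 \neq 0$ and
   $\prod_{y<q+1} \psi(k_1, \ldots, k_n, y) = 0$. So, $\psi(k_1, \ldots, k_n, y) \neq 0$ for $y < q$, and $\psi(k_1, \ldots, k_n, q) = 0$.
   Thus, $\mu y(\psi(k_1, \ldots, k_n, y) = 0) = q$. Therefore, $\varphi$ is HG-computable.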

We now shall proceed to show that every HG-computable function is partial
recursive by means of an arithmetization of the apparatus of Herbrand–Gödel
computability. We shall employ the arithmetization used for first-order
theories in Section 3.4, except for the following changes. The Gödel number
$g(')$ is taken to be 17. The only individual constant 0 is assigned the Gödel
number 19, that is, $g(0) = 19$. The only predicate letter "=" is identified with
"$A_1^2$" and is assigned the Gödel number 99 (as in Section 3.4). Thus, an equation
"$r = s$" is an abbreviation for "$A_1^2(r, s)$". The following relations and functions
are primitive recursive (compare Propositions 3.26-3.27).

FL(x): x is the Gödel number of a function letter:

$$(\exists y)_{y<x}(\exists z)_{z<x}(x = 1 + 8(2^y \cdot 3^z) \wedge y > 0 \wedge z > 0)$$

EVbl(x): x is the Gödel number of an expression consisting of a variable
EFL(x): x is the Gödel number of an expression consisting of a function letter
Num(x): the Gödel number of the numeral $\bar{x}$

$$\mathrm{Num}(0) = 2^{19}$$
$$\mathrm{Num}(y+1) = 2^3 * \mathrm{Num}(y) * 2^5 * 2^{17}$$

Trm(x): x is the Gödel number of a term
Nu(x): x is the Gödel number of a numeral

$$(\exists y)_{y<x}(x = \mathrm{Num}(y))$$

$\mathrm{Arg}_T(x)$ = number of arguments of a function letter, f, if x is the Gödel number
of f
x * y = the Gödel number of an expression AB if x is the Gödel number of the
expression A and y is the Gödel number of B
Subst(x, y, u, v): v is the Gödel number of a variable $x_i$, u is the Gödel number
of a term $t$, y is the Gödel number of an expression $\mathcal{B}$, and x is
the Gödel number of the result of substituting $t$ for all occurrences
of $x_i$ in $\mathcal{B}$
The following are also primitive recursive:
Eqt(x): x is the Gödel number of an equation:

$$(\exists u)_{u<x}(\exists w)_{w<x}(\mathrm{Trm}(u) \wedge \mathrm{Trm}(w) \wedge x = 2^{99} * 2^3 * u * 2^7 * w * 2^5)$$

(Remember that = is $A_1^2$, whose Gödel number is 99.)
Syst(x): x is the Gödel number of a system of equations:


(VY) y<x¢x) Eqt((x)y) A (AU) u<x(AW) wex [EFL(w) A Trm(w) AX 4(x)+1 
= 2% +2 x we?’ *y*2?] 


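Occ(u, v): u is the Gödel number of a term $t$ or equation $\mathcal{B}$ and v is the Gödel
number of a term that occurs in $t$ or $\mathcal{B}$:

$$(\mathrm{Trm}(u) \vee \mathrm{Eqt}(u)) \wedge \mathrm{Trm}(v) \wedge (\exists x)_{x<u}(\exists y)_{y<u}(u = x * v * y \vee u = x * v \vee u = v * y \vee u = v)$$

Cons$_1$(u, v): u is the Gödel number of an equation $e_1$, v is the Gödel number of
an equation $e_2$, and $e_2$ is a consequence of $e_1$ by rule $R_1$:

$$\mathrm{Eqt}(u) \wedge \mathrm{Eqt}(v) \wedge (\exists x)_{x<u}(\exists y)_{y<v}(\mathrm{Nu}(y) \wedge \mathrm{Subst}(v, u, y, x) \wedge \mathrm{Occ}(u, x))$$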
Cons$_2$(u, z, v): u, z, v are Gödel numbers of equations $e_1$, $e_2$, $e_3$, respectively, and
$e_3$ is a consequence of $e_1$ and $e_2$ by rule $R_2$:


Eqt(u) a Eqt(z). A Eqt(v) A A(x) +<z(EVb1(x) A Occ(z, x)) 
AA(AX) x<u(EVbI(x) A Occ(u, x)) A (AP) peu (Ab) rcu(AX) xu (FY) yu 
(4q),<ulq = QPS) at Ay = 2% +23 xg42” *Num(p)*2° A 
(40) .<-(4B)g<.(Trm(a) A Trm(B) A z = 2 #2? #0 #2’ *B#2° A 
{[B=qA0=2” +2? #2’ *Num(p)*2)]v (Sy)ye(A8) sez 
[(B=y*qavu=2” *2?*a*2’ *y*Num(p)*2°)v 
(B=q*5A0=2” *2>*a*2’ *qxNum(p)*2°)v 
(B=y*qe5Av=2"*2)* 0,42’ *y*Num(p)*8*2°)]}) 


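Ded(u, z): u is the Gödel number of a system of equations $E$ and z is the Gödel
number of a proof from $E$:

$$\mathrm{Syst}(u) \wedge (\forall x)_{x<\mathrm{lh}(z)}\bigl((\exists w)_{w<\mathrm{lh}(u)}((u)_w = (z)_x) \vee (\exists y)_{y<x}\mathrm{Cons}_1((z)_y, (z)_x) \vee (\exists y)_{y<x}(\exists v)_{v<x}\mathrm{Cons}_2((z)_y, (z)_v, (z)_x)\bigr)$$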


$S_n(u, x_1, \ldots, x_n, z)$: u is the Gödel number of a system of equations $E$ whose
principal letter is of the form $f_j^n$, and z is the Gödel number of a proof from $E$
of an equation of the form $f_j^n(\bar{x}_1, \ldots, \bar{x}_n) = \bar{p}$:


Ded(u,z) A (Aw) <u(SY)yeul()(uy-1 = 2 *2° * 91+82"3") y oe 
(At) reu(Nu(t) A(z) 221 = 2? #29 #2823) 423 « Num(x1) #27 * 
Num(x2)* 2” *---*Num(x,)*2° *£)] 


Remember that g(() = 3, g()) = 5, and g(,) =7. 


$$U(x) = \mu y_{y<x}(\exists w)_{w<x}\bigl((x)_{\mathrm{lh}(x) \dot{-} 1} = 2^{99} * 2^3 * w * 2^7 * \mathrm{Num}(y) * 2^5\bigr)$$


(The significance of this formula is that, if x is the Gödel number of a proof
of an equation $r = \bar{p}$, then $U(x) = p$.)
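The coding conventions used above can be made concrete with a minimal Python sketch. It assumes the convention of Section 3.4 that an expression $u_0 u_1 \ldots u_r$ receives the Gödel number $2^{g(u_0)} 3^{g(u_1)} \cdots p_r^{g(u_r)}$, and it covers only the five symbols whose numbers are stated in this section. The names `G`, `primes`, `encode`, `decode`, `concat`, and `num` are illustrative. `concat` plays the role of the juxtaposition function $x * y$ and `num` follows the recursion for Num.

```python
# Symbol numbers stated in the text: g(() = 3, g()) = 5, g(,) = 7, g(') = 17, g(0) = 19.
G = {"(": 3, ")": 5, ",": 7, "'": 17, "0": 19}
INVERSE = {number: symbol for symbol, number in G.items()}


def primes():
    """Yield 2, 3, 5, ... by trial division (adequate for short expressions)."""
    found = []
    n = 2
    while True:
        if all(n % p for p in found):
            found.append(n)
            yield n
        n += 1


def encode(symbols):
    """Goedel number of an expression given as a sequence of symbols."""
    code = 1
    for p, s in zip(primes(), symbols):
        code *= p ** G[s]
    return code


def decode(code):
    """Recover the list of symbols from a valid expression code."""
    out = []
    for p in primes():
        if code == 1:
            break
        e = 0
        while code % p == 0:
            code //= p
            e += 1
        out.append(INVERSE[e])
    return out


def concat(x, y):
    """The juxtaposition function x * y on expression codes."""
    return encode(decode(x) + decode(y))


def num(n):
    """Num(n): the Goedel number of the numeral for n."""
    code = encode("0")
    for _ in range(n):
        code = concat(concat(concat(encode("("), code), encode(")")), encode("'"))
    return code


if __name__ == "__main__":
    print("".join(decode(num(2))))
```

The codes grow quickly, so this sketch is useful only for inspecting very small numerals.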


Proposition 5.23


(Kleene, 1936a) If $\varphi(x_1, \ldots, x_n)$ is HG-computable by a system of equations $E$
with Gödel number $e$, then

$$\varphi(x_1, \ldots, x_n) = U(\mu y(S_n(e, x_1, \ldots, x_n, y)))$$

Hence, every HG-computable function $\varphi$ is partial recursive, and, if $\varphi$ is total,
then $\varphi$ is recursive.


Proof


$\varphi(k_1, \ldots, k_n) = p$ if and only if $E \vdash f_j^n(\bar{k}_1, \ldots, \bar{k}_n) = \bar{p}$, where $f_j^n$ is the principal letter
of $E$. $\varphi(k_1, \ldots, k_n)$ is defined if and only if $(\exists y)S_n(e, k_1, \ldots, k_n, y)$. If $\varphi(k_1, \ldots, k_n)$
is defined, $\mu y(S_n(e, k_1, \ldots, k_n, y))$ is the Gödel number of a proof from $E$ of an
equation $f_j^n(\bar{k}_1, \ldots, \bar{k}_n) = \bar{p}$. Hence, $U(\mu y(S_n(e, k_1, \ldots, k_n, y))) = p = \varphi(k_1, \ldots, k_n)$.
Also, since $S_n$ is primitive recursive, $\mu y(S_n(e, x_1, \ldots, x_n, y))$ is partial recursive.
If $\varphi$ is total, then $(\forall x_1) \ldots (\forall x_n)(\exists y)S_n(e, x_1, \ldots, x_n, y)$; hence, $\mu y(S_n(e, x_1, \ldots, x_n, y))$
is recursive, and then, so is $U(\mu y(S_n(e, x_1, \ldots, x_n, y)))$.

Thus, the class of HG-computable functions is identical with the class of
partial recursive functions. This is further evidence for Church's thesis.


5.5.2 Markov Algorithms 


By an algorithm in an alphabet $A$ we mean a computable function $\mathfrak{A}$ whose
domain is a subset of the set of words of $A$ and the values of which are
also words in $A$. If $P$ is a word in $A$, $\mathfrak{A}$ is said to be applicable to $P$ if $P$ is in
the domain of $\mathfrak{A}$; if $\mathfrak{A}$ is applicable to $P$, we denote its value by $\mathfrak{A}(P)$. By an
algorithm over an alphabet $A$ we mean an algorithm $\mathfrak{A}$ in an extension $B$
of $A$.* Of course, the notion of algorithm is as hazy as that of computable
function.

Most familiar algorithms can be broken down into a few simple steps.
Starting from this observation and following Markov (1954), we select a
particularly simple operation, substitution of one word for another, as the basic
unit from which algorithms are to be constructed. To this end, if $P$ and $Q$ are
words of an alphabet $A$, then we call the expressions $P \to Q$ and $P \to \cdot Q$
productions in the alphabet $A$. We assume that "$\to$" and "$\cdot$" are not symbols of $A$.
Notice that $P$ or $Q$ is permitted to be the empty word. $P \to Q$ is called a simple
production, whereas $P \to \cdot Q$ is a terminal production. Let us use $P \to (\cdot)Q$ to
denote either $P \to Q$ or $P \to \cdot Q$. A finite list of productions in $A$

$$P_1 \to (\cdot)Q_1$$
$$P_2 \to (\cdot)Q_2$$
$$\vdots$$
$$P_r \to (\cdot)Q_r$$

is called an algorithm schema and determines the following algorithm $\mathfrak{A}$ in $A$.
As a preliminary definition, we say that a word $T$ occurs in a word $Q$ if there
are words $U$, $V$ (either one possibly the empty word $\Lambda$) such that $Q = UTV$.


* An alphabet $B$ is an extension of $A$ if $A \subseteq B$.
Now, given a word $P$ in $A$: (1) We write $\mathfrak{A}: P \sqsupset$ if none of the words $P_1, \ldots, P_r$
occurs in $P$. (2) Otherwise, if $m$ is the least integer, with $1 \le m \le r$, such that
$P_m$ occurs in $P$, and if $R$ is the word that results from replacing the leftmost
occurrence of $P_m$ in $P$ by $Q_m$, then we write


a. $\mathfrak{A}: P \vdash R$
if $P_m \to (\cdot)Q_m$ is simple (and we say that $\mathfrak{A}$ simply transforms $P$ into $R$);
b. $\mathfrak{A}: P \vdash\cdot\, R$
if $P_m \to (\cdot)Q_m$ is terminal (and we say that $\mathfrak{A}$ terminally transforms $P$ into $R$).


We then define $\mathfrak{A}: P \models R$ to mean that there is a sequence $R_0, R_1, \ldots, R_k$ such
that


i. $P = R_0$.
ii. $R = R_k$.
iii. For $0 \le j \le k - 2$, $\mathfrak{A}: R_j \vdash R_{j+1}$.
iv. Either $\mathfrak{A}: R_{k-1} \vdash R_k$ or $\mathfrak{A}: R_{k-1} \vdash\cdot\, R_k$. (In the second case, we write
$\mathfrak{A}: P \models\cdot\, R$.)


We set $\mathfrak{A}(P) = R$ if and only if either $\mathfrak{A}: P \models\cdot\, R$, or $\mathfrak{A}: P \models R$ and $\mathfrak{A}: R \sqsupset$. The
algorithm thus defined is called a normal algorithm (or a Markov algorithm)
in the alphabet $A$.

The action of $\mathfrak{A}$ can be described as follows: given a word $P$, we find the
first production $P_m \to (\cdot)Q_m$ in the schema such that $P_m$ occurs in $P$. We then
substitute $Q_m$ for the leftmost occurrence of $P_m$ in $P$. Let $R_1$ be the new word
obtained in this way. If $P_m \to (\cdot)Q_m$ was a terminal production, the process
stops and the value of the algorithm is $R_1$. If $P_m \to (\cdot)Q_m$ was simple, then we
apply the same process to $R_1$ as was just applied to $P$, and so on. If we ever
obtain a word $R_i$ such that $\mathfrak{A}: R_i \sqsupset$, then the process stops and the value $\mathfrak{A}(P)$
is $R_i$. It is possible that the process just described never stops. In that case,
$\mathfrak{A}$ is not applicable to the given word $P$.
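The procedure just described is short enough to write out as a program. The following Python sketch is only an illustration under stated assumptions: words are modelled as strings, the empty string stands for the empty word, a production is a hypothetical triple `(left, right, is_terminal)`, and a step budget `max_steps` (not part of the definition) is used so that a non-terminating computation is reported as `None` instead of running forever.

```python
from typing import List, Optional, Tuple

# A production is (left, right, is_terminal). Words are Python strings and
# the empty string "" stands for the empty word.
Production = Tuple[str, str, bool]


def run_normal_algorithm(schema: List[Production], word: str,
                         max_steps: int = 10_000) -> Optional[str]:
    """Return the value of the algorithm on `word`, or None when the step
    budget is exhausted (the algorithm may then be inapplicable)."""
    for _ in range(max_steps):
        for left, right, is_terminal in schema:
            pos = word.find(left)  # leftmost occurrence; "" occurs at 0
            if pos != -1:
                word = word[:pos] + right + word[pos + len(left):]
                if is_terminal:
                    return word    # terminal production: stop
                break              # simple production: rescan the schema
        else:
            return word            # no left-hand side occurs in the word
    return None


# Illustrative schema: erase the leftmost b (terminal), otherwise loop on c.
erase_b = [("b", "", True), ("c", "c", False)]
print(run_normal_algorithm(erase_b, "cbcb"))
print(run_normal_algorithm(erase_b, "cc", max_steps=100))
```

Exhausting the budget does not prove that the algorithm is inapplicable to the word; it only reports that no value was reached within `max_steps` steps.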

Our exposition of the theory of normal algorithms will be based on Markov 
(1954). 


Examples 
1. Let $A$ be the alphabet $\{b, c\}$. Consider the schema


$b \to \cdot\Lambda$
$c \to c$

The normal algorithm $\mathfrak{A}$ defined by this schema transforms any word
that contains at least one occurrence of $b$ into the word obtained by
erasing the leftmost occurrence of $b$. $\mathfrak{A}$ transforms the empty word $\Lambda$
into itself. $\mathfrak{A}$ is not applicable to any nonempty word that does not
contain $b$.


2. Let $A$ be the alphabet $\{a_0, a_1, \ldots, a_n\}$. Consider the schema


$a_0 \to \Lambda$
$a_1 \to \Lambda$
$\vdots$
$a_n \to \Lambda$

We can abbreviate this schema as follows:

$\xi \to \Lambda$ ($\xi$ in $A$)


(Whenever we use such abbreviations, the productions intended may
be listed in any order.) The corresponding normal algorithm transforms
every word into the empty word. For example,


$\mathfrak{A}: a_1a_2a_1a_3a_0 \vdash a_1a_2a_1a_3 \vdash a_2a_1a_3 \vdash a_2a_3 \vdash a_3 \vdash \Lambda$ and $\mathfrak{A}: \Lambda \sqsupset$. Hence,


$\mathfrak{A}(a_1a_2a_1a_3a_0) = \Lambda$.


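3. Let $A$ be an alphabet containing the symbol $a_1$, which we shall abbreviate
$|$. For natural numbers $n$, we define $\overline{n}$ inductively as follows:
$\overline{0} = |$ and $\overline{n+1} = \overline{n}|$. Thus, $\overline{1} = ||$, $\overline{2} = |||$, and so on. The words $\overline{n}$ will be
called numerals. Now consider the schema $\Lambda \to \cdot|$, defining a normal
algorithm $\mathfrak{A}$. For any word $P$ in $A$, $\mathfrak{A}(P) = |P$.* In particular, for every
natural number $n$, $\mathfrak{A}(\overline{n}) = \overline{n+1}$.

* To see this, observe that $\Lambda$ occurs at the beginning of any word $P$, since $P = \Lambda P$.


4. Let $A$ be an arbitrary alphabet $\{a_0, a_1, \ldots, a_n\}$. Given a word $P = a_{j_0}a_{j_1} \cdots a_{j_k}$,
let $\breve{P} = a_{j_k} \cdots a_{j_1}a_{j_0}$ be the inverse of $P$. We seek a normal algorithm $\mathfrak{A}$ such
that $\mathfrak{A}(P) = \breve{P}$. Consider the following (abbreviated) algorithm schema
in the alphabet $B = A \cup \{\alpha, \beta\}$.


a. $\alpha\alpha \to \beta$
b. $\beta\xi \to \xi\beta$ ($\xi$ in $A$)
c. $\beta\alpha \to \beta$
d. $\beta \to \cdot\Lambda$
e. $\alpha\eta\xi \to \xi\alpha\eta$ ($\xi$, $\eta$ in $A$)
f. $\Lambda \to \alpha$


This determines a normal algorithm $\mathfrak{A}$ in $B$. Let $P = a_{j_0}a_{j_1} \cdots a_{j_k}$
be any word in $A$. Then $\mathfrak{A}: P \vdash \alpha P$ by production (f). Then,
$\alpha P \vdash a_{j_1}\alpha a_{j_0}a_{j_2} \cdots a_{j_k} \vdash a_{j_1}a_{j_2}\alpha a_{j_0}a_{j_3} \cdots a_{j_k} \vdash \cdots \vdash a_{j_1}a_{j_2} \cdots a_{j_k}\alpha a_{j_0}$,
all by production (e). Thus, $\mathfrak{A}: P \models a_{j_1}a_{j_2} \cdots a_{j_k}\alpha a_{j_0}$. Then, by production (f),
$\mathfrak{A}: P \models \alpha a_{j_1}a_{j_2} \cdots a_{j_k}\alpha a_{j_0}$. Iterating this process, we obtain
$\mathfrak{A}: P \models \alpha a_{j_k}\alpha a_{j_{k-1}}\alpha \cdots \alpha a_{j_1}\alpha a_{j_0}$. Then, by production (f),
$\mathfrak{A}: P \models \alpha\alpha a_{j_k}\alpha a_{j_{k-1}}\alpha \cdots \alpha a_{j_1}\alpha a_{j_0}$, and, by production (a),
$\mathfrak{A}: P \models \beta a_{j_k}\alpha a_{j_{k-1}}\alpha \cdots \alpha a_{j_1}\alpha a_{j_0}$. Applying productions (b) and (c)
and finally (d), we arrive at $\mathfrak{A}: P \models\cdot\, \breve{P}$. Thus, $\mathfrak{A}$ is a normal algorithm over
$A$ that inverts every word of $A$.*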
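The derivation traced in Example 4 can be replayed mechanically. The Python sketch below expands the abbreviated schema (a) to (f) for a concrete alphabet and records every intermediate word. It is an illustration only: the names `trace` and `reversal_schema` are hypothetical, words are modelled as strings, and the characters `α` and `β` are assumed not to occur in the alphabet passed in.

```python
def trace(schema, word, max_steps=1000):
    """Return the list of words produced by the schema, starting with `word`."""
    history = [word]
    for _ in range(max_steps):
        for left, right, terminal in schema:
            pos = word.find(left)          # leftmost occurrence of the left side
            if pos != -1:
                word = word[:pos] + right + word[pos + len(left):]
                history.append(word)
                if terminal:
                    return history
                break
        else:
            return history                 # no production applies
    raise RuntimeError("step budget exhausted")


def reversal_schema(alphabet):
    """Expand the abbreviated schema (a)-(f) of Example 4 for `alphabet`."""
    a, b = "α", "β"                        # auxiliary symbols, assumed fresh
    schema = [(a + a, b, False)]                               # (a)
    schema += [(b + x, x + b, False) for x in alphabet]        # (b)
    schema += [(b + a, b, False)]                              # (c)
    schema += [(b, "", True)]                                  # (d)
    schema += [(a + y + x, x + a + y, False)
               for x in alphabet for y in alphabet]            # (e)
    schema += [("", a, False)]                                 # (f)
    return schema


steps = trace(reversal_schema("abc"), "abc")
print(" -> ".join(w if w else "Λ" for w in steps))
```

Printing the trace for a three-letter word shows the three passes of production (e), the doubled `α` that triggers production (a), and the final sweep of `β` to the right end.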


Exercises 


5.61 Let $A$ be an alphabet. Describe the action of the normal algorithms
given by the following schemas.


a. Let $Q$ be a fixed word in $A$ and let the algorithm schema be: $\Lambda \to \cdot Q$.


b. Let $Q$ be a fixed word in $A$ and let $\alpha$ be a symbol not in $A$. Let
$B = A \cup \{\alpha\}$. Consider the schema


$\alpha\xi \to \xi\alpha$ ($\xi$ in $A$)
$\alpha \to \cdot Q$
$\Lambda \to \alpha$


c. Let $Q$ be a fixed word in $A$. Take the schema

$\xi \to \Lambda$ ($\xi$ in $A$)
$\Lambda \to \cdot Q$

d. Let $B = A \cup \{|\}$. Consider the schema


$\xi \to |$ ($\xi$ in $A - \{|\}$)
$\Lambda \to \cdot|$


* The distinction between a normal algorithm in A and a normal algorithm over A is impor- 
tant. A normal algorithm in A uses only symbols of A, whereas a normal algorithm over A 
may use additional symbols not in A. Every normal algorithm in A is a normal algorithm 
over A, but there are algorithms in A that are determined by normal algorithms over A but 
that are not normal algorithms in A (for example, the algorithm of Exercise 5.62(d)). 
5.62 Let $A$ be an alphabet not containing the symbols $\alpha$, $\beta$, $\gamma$. Let $B = A \cup \{\alpha\}$
and $C = A \cup \{\alpha, \beta, \gamma\}$.

a. Construct a normal algorithm $\mathfrak{A}$ in $B$ such that $\mathfrak{A}(\Lambda) = \Lambda$ and
$\mathfrak{A}(\xi P) = P$ for any symbol $\xi$ in $A$ and any word $P$ in $A$. Thus, $\mathfrak{A}$ erases
the first letter of any nonempty word in $A$.


b. Construct a normal algorithm $\mathfrak{D}$ in $B$ such that $\mathfrak{D}(\Lambda) = \Lambda$ and
$\mathfrak{D}(P\xi) = P$ for any symbol $\xi$ in $A$ and any word $P$ in $A$. Thus, $\mathfrak{D}$
erases the last letter of any nonempty word in $A$.


c. Construct a normal algorithm $\mathfrak{C}$ in $B$ such that $\mathfrak{C}(P)$ equals $\Lambda$ if $P$
contains exactly two occurrences of $\alpha$ and $\mathfrak{C}(P)$ is defined and is not
equal to $\Lambda$ in all other cases.


d. Construct a normal algorithm $\mathfrak{B}$ in $C$ such that, for any word $P$ of
$A$, $\mathfrak{B}(P) = PP$.

5.63 Let $A$ and $B$ be alphabets and let $\alpha$ be a symbol in neither $A$ nor $B$. For
certain symbols $a_1, \ldots, a_k$ in $A$, let $Q_1, \ldots, Q_k$ be corresponding words
in $B$. Consider the algorithm that associates with each word $P$ of $A$
the word $\mathrm{Sub}^{a_1 \ldots a_k}_{Q_1 \ldots Q_k}(P)$ obtained by simultaneous substitution of each
$Q_i$ for $a_i$ ($i = 1, \ldots, k$). Show that this is given by a normal algorithm in
$A \cup B \cup \{\alpha\}$.

5.64 Let $H = \{|\}$ and $M = \{|, B\}$. Every natural number $n$ is represented by its
numeral $\overline{n}$, which is a word in $H$. We represent every $k$-tuple $(n_1, n_2, \ldots, n_k)$
of natural numbers by the word $\overline{n_1}B\overline{n_2}B \ldots B\overline{n_k}$ in $M$. We shall denote
this word by $\overline{(n_1, n_2, \ldots, n_k)}$. For example, $\overline{(3, 1, 2)}$ is $||||B||B|||$.

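a. Show that the schema


$B \to B$
$\alpha|| \to \alpha|$
$\alpha| \to \cdot|$
$\Lambda \to \alpha$


defines a normal algorithm $\mathfrak{A}_Z$ over $M$ such that $\mathfrak{A}_Z(\overline{n}) = \overline{0}$ for
any $n$, and $\mathfrak{A}_Z$ is applicable only to numerals in $M$.


b. Show that the schema

$B \to B$
$\alpha| \to \cdot||$
$\Lambda \to \alpha$


defines a normal algorithm $\mathfrak{A}_N$ over $M$ such that $\mathfrak{A}_N(\overline{n}) = \overline{n+1}$ for
all $n$, and $\mathfrak{A}_N$ is applicable only to numerals in $M$.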
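c. Let $\alpha_1, \ldots, \alpha_{2k}$ be symbols not in $M$. Let $1 \le j \le k$. Let $\mathcal{S}_i$ be the list


$\alpha_{2i-1}B \to \alpha_{2i-1}B$
$\alpha_{2i-1}| \to \alpha_{2i}|$
$\alpha_{2i}| \to \alpha_{2i}$
$\alpha_{2i}B \to \alpha_{2i+1}$


If $1 < j < k$, consider the algorithm schema

$\mathcal{S}_1$
$\vdots$
$\mathcal{S}_{j-1}$
$\alpha_{2j-1}B \to \alpha_{2j-1}B$
$\alpha_{2j-1}| \to \alpha_{2j}|$
$\alpha_{2j}| \to |\alpha_{2j}$
$\alpha_{2j}B \to \alpha_{2j+1}$
$\mathcal{S}_{j+1}$
$\vdots$
$\mathcal{S}_{k-1}$
$\alpha_{2k-1}B \to \alpha_{2k-1}B$
$\alpha_{2k-1}| \to \alpha_{2k}|$
$\alpha_{2k}| \to \alpha_{2k}$
$\alpha_{2k}B \to \alpha_{2k}B$
$\alpha_{2k} \to \cdot\Lambda$
$\Lambda \to \alpha_1$

If $j = 1$, consider the schema

$\alpha_1B \to \alpha_1B$
$\alpha_1| \to \alpha_2|$
$\alpha_2| \to |\alpha_2$
$\alpha_2B \to \alpha_3$
$\mathcal{S}_2$
$\vdots$
$\mathcal{S}_{k-1}$
$\alpha_{2k-1}B \to \alpha_{2k-1}B$
$\alpha_{2k-1}| \to \alpha_{2k}|$
$\alpha_{2k}| \to \alpha_{2k}$
$\alpha_{2k}B \to \alpha_{2k}B$
$\alpha_{2k} \to \cdot\Lambda$
$\Lambda \to \alpha_1$

If $j = k$, consider the schema

$\mathcal{S}_1$
$\vdots$
$\mathcal{S}_{k-1}$
$\alpha_{2k-1}B \to \alpha_{2k-1}B$
$\alpha_{2k-1}| \to \alpha_{2k}|$
$\alpha_{2k}| \to |\alpha_{2k}$
$\alpha_{2k}B \to \alpha_{2k}B$
$\alpha_{2k} \to \cdot\Lambda$
$\Lambda \to \alpha_1$


Show that the corresponding normal algorithm $\mathfrak{A}^k_j$ is such that
$\mathfrak{A}^k_j(\overline{(n_1, \ldots, n_k)}) = \overline{n_j}$; and $\mathfrak{A}^k_j$ is applicable to only words of the
form $\overline{(n_1, \ldots, n_k)}$.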

d. Construct a schema for a normal algorithm in $M$ transforming
$\overline{(n_1, n_2)}$ into $\overline{|n_1 - n_2|}$.

e. Construct a normal algorithm in M for addition. 

f. Construct a normal algorithm over M for multiplication. 


Given algorithms $\mathfrak{A}$ and $\mathfrak{B}$ and a word $P$, we write $\mathfrak{A}(P) \approx \mathfrak{B}(P)$ if and only if
either $\mathfrak{A}$ and $\mathfrak{B}$ are both applicable to $P$ and $\mathfrak{A}(P) = \mathfrak{B}(P)$ or neither $\mathfrak{A}$ nor $\mathfrak{B}$ is
applicable to $P$. More generally, if $C$ and $D$ are expressions, then $C \approx D$ is to
hold if and only if neither $C$ nor $D$ is defined, or both $C$ and $D$ are defined and
denote the same object. If $\mathfrak{A}$ and $\mathfrak{B}$ are algorithms over an alphabet $A$, then we
say that $\mathfrak{A}$ and $\mathfrak{B}$ are fully equivalent relative to $A$ if and only if $\mathfrak{A}(P) \approx \mathfrak{B}(P)$ for
every word $P$ in $A$; we say that $\mathfrak{A}$ and $\mathfrak{B}$ are equivalent relative to $A$ if and only if,
for any word $P$ in $A$, whenever $\mathfrak{A}(P)$ or $\mathfrak{B}(P)$ exists and is in $A$, then $\mathfrak{A}(P) \approx \mathfrak{B}(P)$.

Let $M$ be the alphabet $\{|, B\}$, as in Exercise 5.64, and let $\omega$ be the set of
natural numbers. Given a partial number-theoretic function $\varphi$ of $k$ arguments,
that is, a function from a subset of $\omega^k$ into $\omega$, we denote by $\mathfrak{B}_\varphi$ the
corresponding function in $M$; that is, $\mathfrak{B}_\varphi(\overline{(n_1, \ldots, n_k)}) = \overline{\varphi(n_1, \ldots, n_k)}$ whenever either of
the two sides of the equation is defined. $\mathfrak{B}_\varphi$ is assumed to be inapplicable
to words not of the form $\overline{(n_1, \ldots, n_k)}$. The function $\varphi$ is said to be
Markov-computable if and only if there is a normal algorithm $\mathfrak{A}$ over $M$ that has the
value $\overline{\varphi(n_1, \ldots, n_k)}$ when applied to $\overline{(n_1, \ldots, n_k)}$.*

A normal algorithm is said to be closed if and only if one of the productions in
its schema has the form $\Lambda \to \cdot Q$. Such an algorithm can end only terminally,
that is, by an application of a terminal production. Given an arbitrary normal
algorithm $\mathfrak{A}$, add on at the end of the schema for $\mathfrak{A}$ the new production $\Lambda \to \cdot\Lambda$,
and denote by $\mathfrak{A}\cdot$ the normal algorithm determined by this enlarged schema.
$\mathfrak{A}\cdot$ is closed, and $\mathfrak{A}\cdot$ is fully equivalent to $\mathfrak{A}$ relative to the alphabet of $\mathfrak{A}$.

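Let us now show that the composition of two normal algorithms is again
a normal algorithm. Let $\mathfrak{A}$ and $\mathfrak{B}$ be normal algorithms in an alphabet $A$.
For each symbol $b$ in $A$, form a new symbol $\overline{b}$, called the correlate of $b$. Let $\overline{A}$
be the alphabet consisting of the correlates of the symbols of $A$. We assume
that $A$ and $\overline{A}$ have no symbols in common. Let $\alpha$ and $\beta$ be two symbols not in
$A \cup \overline{A}$. Let $\mathcal{S}_{\mathfrak{A}}$ be the schema of $\mathfrak{A}\cdot$ except that the terminal dot in terminal
productions is replaced by $\alpha$. Let $\mathcal{S}_{\mathfrak{B}}$ be the schema of $\mathfrak{B}\cdot$ except that every
symbol is replaced by its correlate, every terminal dot is replaced by $\beta$,
productions of the form $\Lambda \to Q$ are replaced by $\alpha \to \alpha Q$, and productions $\Lambda \to \cdot Q$
are replaced by $\alpha \to \alpha\beta Q$. Consider the abbreviated schema


$a\alpha \to \alpha a$ ($a$ in $A$)
$\alpha a \to \alpha\overline{a}$ ($a$ in $A$)
$\overline{a}b \to \overline{a}\overline{b}$ ($a$, $b$ in $A$)
$\overline{a}\beta \to \beta\overline{a}$ ($a$ in $A$)
$\beta\overline{a} \to \beta a$ ($a$ in $A$)
$a\overline{b} \to ab$ ($a$, $b$ in $A$)
$\alpha\beta \to \cdot\Lambda$
$\mathcal{S}_{\mathfrak{B}}$
$\mathcal{S}_{\mathfrak{A}}$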


* In this and in all other definitions in this chapter, the existential quantifier “there is” is meant 
in the ordinary “classical” sense. When we assert that there exists an object of a certain kind, 
we do not necessarily imply that any human being has found or ever will find such an object. 
Thus, a function $\varphi$ may be Markov-computable without our ever knowing it to be so.
This schema determines a normal algorithm $\mathfrak{C}$ over $A$ such that $\mathfrak{C}(P) \approx
\mathfrak{B}(\mathfrak{A}(P))$ for any word $P$ in $A$. $\mathfrak{C}$ is called the composition of $\mathfrak{A}$ and $\mathfrak{B}$ and is
denoted $\mathfrak{B} \circ \mathfrak{A}$.
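To see the composition schema at work, it can be assembled programmatically. The Python sketch below is a minimal illustration under assumptions that are not in the text: words are strings, the alphabet consists of lowercase ASCII letters, the uppercase form of a letter plays the role of its correlate, and the characters `α` and `β` are the two auxiliary symbols. The names `run` and `compose` are hypothetical.

```python
ALPHA, BETA = "α", "β"   # auxiliary symbols, assumed not in the alphabet


def run(schema, word, max_steps=10_000):
    """Value of the normal algorithm on `word`, or None if the budget runs out."""
    for _ in range(max_steps):
        for left, right, terminal in schema:
            pos = word.find(left)
            if pos != -1:
                word = word[:pos] + right + word[pos + len(left):]
                if terminal:
                    return word
                break
        else:
            return word
    return None


def compose(schema_a, schema_b, alphabet):
    """Schema for the composition: first schema_a, then schema_b.

    `alphabet` must consist of lowercase ASCII letters; str.upper gives
    the correlate of a letter (an assumption of this sketch)."""
    bar = str.upper
    closed_a = schema_a + [("", "", True)]       # add Λ → ·Λ
    closed_b = schema_b + [("", "", True)]
    # S_A: the terminal dot becomes α, so every production is simple.
    s_a = [(l, (ALPHA if t else "") + r, False) for l, r, t in closed_a]
    # S_B: correlates everywhere, terminal dot becomes β, and an empty
    # left-hand side is anchored at α.
    s_b = []
    for l, r, t in closed_b:
        mark = BETA if t else ""
        if l == "":
            s_b.append((ALPHA, ALPHA + mark + bar(r), False))
        else:
            s_b.append((bar(l), mark + bar(r), False))
    glue = [(a + ALPHA, ALPHA + a, False) for a in alphabet]
    glue += [(ALPHA + a, ALPHA + bar(a), False) for a in alphabet]
    glue += [(bar(a) + b, bar(a) + bar(b), False)
             for a in alphabet for b in alphabet]
    glue += [(bar(a) + BETA, BETA + bar(a), False) for a in alphabet]
    glue += [(BETA + bar(a), BETA + a, False) for a in alphabet]
    glue += [(a + bar(b), a + b, False) for a in alphabet for b in alphabet]
    glue += [(ALPHA + BETA, "", True)]
    return glue + s_b + s_a


erase_first_b = [("b", "", True)]      # erase the leftmost b, if any
prepend_c = [("", "c", True)]          # put a c in front of the word
both = compose(erase_first_b, prepend_c, "bc")
print(run(both, "cb"))
```

On the word `cb`, the first algorithm leaves `c` and the second prepends another `c`; the composed schema reaches the same word by way of the intermediate words `cα`, `αc`, `αC`, `αβCC`, `αβcC` and `αβcc`.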

Let $\mathfrak{D}$ be a normal algorithm in an alphabet $A$ and let $B$ be an extension of $A$. If we
take a schema for $\mathfrak{D}$ and prefix to it the production $b \to b$ for each symbol $b$
in $B - A$, then the new schema determines a normal algorithm $\mathfrak{D}_B$ in $B$ such
that $\mathfrak{D}_B(P) \approx \mathfrak{D}(P)$ for every word $P$ in $A$, and $\mathfrak{D}_B$ is not applicable to any word
in $B$ that contains any symbol of $B - A$. $\mathfrak{D}_B$ is fully equivalent to $\mathfrak{D}$ relative
to $A$ and is called the propagation of $\mathfrak{D}$ onto $B$.

Assume that $\mathfrak{A}$ is a normal algorithm in an alphabet $A_1$, and $\mathfrak{B}$ is a normal
algorithm in an alphabet $A_2$. Let $A = A_1 \cup A_2$. Let $\mathfrak{A}_A$ and $\mathfrak{B}_A$ be the
propagations of $\mathfrak{A}$ and $\mathfrak{B}$, respectively, onto $A$. Then the composition $\mathfrak{C}$ of $\mathfrak{A}_A$ and $\mathfrak{B}_A$
is called the normal composition of $\mathfrak{A}$ and $\mathfrak{B}$ and is denoted by $\mathfrak{B} \circ \mathfrak{A}$. (When
$A_1 = A_2$, the normal composition of $\mathfrak{A}$ and $\mathfrak{B}$ is identical with the composition
of $\mathfrak{A}$ and $\mathfrak{B}$; hence the notation $\mathfrak{B} \circ \mathfrak{A}$ is unambiguous.) $\mathfrak{C}$ is a normal
algorithm over $A$ such that $\mathfrak{C}(P) \approx \mathfrak{B}(\mathfrak{A}(P))$ for any word $P$ in $A_1$, and $\mathfrak{C}$ is
applicable to only those words $P$ of $A$ such that $P$ is a word of $A_1$, $\mathfrak{A}$ is applicable
to $P$, and $\mathfrak{B}$ is applicable to $\mathfrak{A}(P)$. For a composition of three or more normal
algorithms, we must use normal compositions, since a composition enlarges
the alphabet.


Proposition 5.24 


Let $\mathcal{T}$ be a Turing machine with alphabet $A$. Then there is a normal
algorithm $\mathfrak{A}$ over $A$ that is fully equivalent to the Turing algorithm $\mathrm{Alg}_{\mathcal{T}}$ relative
to $A$.


Proof* 


Let $D = \{q_{k_0}, \ldots, q_{k_m}\}$, where $q_{k_0}, \ldots, q_{k_m}$ are the internal states of $\mathcal{T}$ and $q_{k_0} = q_0$.
Write the algorithm schema for $\mathfrak{A}$ as follows: Choose a new symbol $\alpha$ and
start by taking the productions $\alpha a_i \to q_0a_i$ for all symbols $a_i$ in $A$, followed by
the production $\alpha \to \alpha$. Now continue with the algorithm schema in the
following manner. First, for all quadruples $q_ja_ia_kq_r$ of $\mathcal{T}$, take the production
$q_ja_i \to q_ra_k$. Second, for each quadruple $q_ja_iLq_r$ of $\mathcal{T}$, take the productions
$a_nq_ja_i \to q_ra_na_i$ for all symbols $a_n$ of $A$; then take the production
$q_ja_i \to q_ra_0a_i$. Third, for each quadruple $q_ja_iRq_r$ of $\mathcal{T}$, take the productions
$q_ja_ia_n \to a_iq_ra_n$ for all symbols $a_n$ of $A$; then take the production $q_ja_i \to a_iq_ra_0$.
Fourth, write the productions $q_{k_i} \to \cdot\Lambda$ for each internal state $q_{k_i}$ of $\mathcal{T}$ and
finally take $\Lambda \to \alpha$. This schema defines a normal algorithm $\mathfrak{A}$ over $A$, and it
is easy to see that, for any word $P$ of $A$, $\mathrm{Alg}_{\mathcal{T}}(P) \approx \mathfrak{A}(P)$.


* This version of the proof is due to Gordon McLean, Jr. 
Corollary 5.25


Every Turing-computable function is Markov-computable. 


Proof 


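Let $f(x_1, \ldots, x_n)$ be standard Turing-computable by a Turing machine $\mathcal{T}$ with
alphabet $A \supseteq \{|, B\}$. (Remember that $B$ is $a_0$ and $|$ is $a_1$.) We know that, for
any natural numbers $k_1, \ldots, k_n$, if $f(k_1, \ldots, k_n)$ is not defined, then $\mathrm{Alg}_{\mathcal{T}}$ is not
applicable to $\overline{(k_1, \ldots, k_n)}$, whereas, if $f(k_1, \ldots, k_n)$ is defined, then


$$\mathrm{Alg}_{\mathcal{T}}(\overline{(k_1, \ldots, k_n)}) \approx R_1\overline{(k_1, \ldots, k_n)}B\overline{f(k_1, \ldots, k_n)}R_2$$


where $R_1$ and $R_2$ are (possibly empty) sequences of $B$s. Let $\mathfrak{B}$ be a normal
algorithm over $A$ that is fully equivalent to $\mathrm{Alg}_{\mathcal{T}}$ relative to $A$. Let $\mathfrak{C}$ be the
normal algorithm over $\{|, B\}$ determined by the schema


$\alpha B \to \alpha$
$\alpha| \to \beta|$
$\beta| \to |\beta$
$\beta B \to B\gamma$
$\gamma| \to \beta|$
$\gamma B \to \gamma$
$B\gamma \to \cdot\Lambda$
$\beta \to \cdot\Lambda$
$\Lambda \to \alpha$


If $R_1$ and $R_2$ are possibly empty sequences of $B$s, then $\mathfrak{C}$, when applied to
$R_1\overline{(k_1, \ldots, k_n)}B\overline{f(k_1, \ldots, k_n)}R_2$, will erase $R_1$ and $R_2$. Finally, let $\mathfrak{A}^{n+1}_{n+1}$ be the
normal “projection” algorithm defined in Exercise 5.64(c). Then the normal
composition $\mathfrak{A}^{n+1}_{n+1} \circ \mathfrak{C} \circ \mathfrak{B}$ is a normal algorithm that computes $f$.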

Let $\mathfrak{A}$ be any algorithm over an alphabet $A = \{a_{j_0}, \ldots, a_{j_m}\}$. We can associate
with $\mathfrak{A}$ a partial number-theoretic function $\psi_{\mathfrak{A}}$ such that $\psi_{\mathfrak{A}}(n) = m$ if and
only if either $n$ is not the Gödel number* of a word of $A$ and $m = 0$, or $n$ and
$m$ are Gödel numbers of words $P$ and $Q$ of $A$ such that $\mathfrak{A}(P) = Q$.


Proposition 5.26 


If $\mathfrak{A}$ is a normal algorithm over $A = \{a_{j_0}, \ldots, a_{j_m}\}$, then $\psi_{\mathfrak{A}}$ is partial recursive.


* Here and below, we use the Gödel numbering of the language of Turing computability given in
Section 5.3 (page 325). Thus, the Gödel number $g(a_i)$ of $a_i$ is $7 + 4i$. In particular, $g(B) = g(a_0) = 7$
and $g(|) = g(a_1) = 11$.
Proof


We may assume that the symbols of the alphabet of $\mathfrak{A}$ are of the form $a_i$.
Given a simple production $P \to Q$, we call $2^1 3^{g(P)} 5^{g(Q)}$ its index; given a
terminal production $P \to \cdot Q$, we let $2^2 3^{g(P)} 5^{g(Q)}$ be its index. If $P_0 \to (\cdot)Q_0, \ldots, P_r \to (\cdot)Q_r$
is an algorithm schema, we let its index be $2^{k_0} 3^{k_1} \cdots p_r^{k_r}$, where $k_i$ is the index
of $P_i \to (\cdot)Q_i$. Let $\mathrm{Word}(u)$ be the recursive predicate that holds if and only if $u$
is the Gödel number of a finite sequence of symbols of the form $a_i$:


$$u \ne 0 \wedge [u = 1 \vee (\forall z)(z < \ell h(u) \Rightarrow (\exists y)(y < u \wedge (u)_z = 7 + 4y))]$$


Let $\mathrm{SI}(u)$ be the recursive predicate that holds when $u$ is the index of a simple
production: $\ell h(u) = 3 \wedge (u)_0 = 1 \wedge \mathrm{Word}((u)_1) \wedge \mathrm{Word}((u)_2)$. Similarly, $\mathrm{TI}(u)$ is the
recursive predicate that holds when $u$ is the index of a terminal production:
$\ell h(u) = 3 \wedge (u)_0 = 2 \wedge \mathrm{Word}((u)_1) \wedge \mathrm{Word}((u)_2)$. Let $\mathrm{Ind}(u)$ be the recursive
predicate that holds when $u$ is the index of an algorithm schema:
$u > 1 \wedge (\forall z)(z < \ell h(u) \Rightarrow \mathrm{SI}((u)_z) \vee \mathrm{TI}((u)_z))$. Let $\mathrm{Lsub}(x, y, e)$ be the recursive
predicate that holds if and only if $e$ is the index of a production $P \to (\cdot)Q$ and $x$ and $y$ are Gödel
numbers of words $U$ and $V$ such that $P$ occurs in $U$, and $V$ is the result of
substituting $Q$ for the leftmost occurrence of $P$ in $U$:


$$\mathrm{Word}(x) \wedge \mathrm{Word}(y) \wedge (\mathrm{SI}(e) \vee \mathrm{TI}(e)) \wedge (\exists u)_{u \le x}(\exists v)_{v \le x}(x = u * (e)_1 * v
\wedge y = u * (e)_2 * v \wedge \neg(\exists w)_{w \le x}(\exists z)_{z \le x}(x = w * (e)_1 * z \wedge w < u))$$


Let $\mathrm{Occ}(x, y)$ be the recursive predicate that holds when $x$ and $y$ are Gödel
numbers of words $U$ and $V$ such that $V$ occurs in $U$: $\mathrm{Word}(x) \wedge \mathrm{Word}(y) \wedge
(\exists v)_{v \le x}(\exists z)_{z \le x}(x = v * y * z)$. Let $\mathrm{End}(e, z)$ be the recursive predicate that holds
when and only when $z$ is the Gödel number of a word $P$, and $e$ is the index
of an algorithm schema defining an algorithm $\mathfrak{A}$ that cannot be applied to
$P$ (i.e., $\mathfrak{A}: P \sqsupset$): $\mathrm{Ind}(e) \wedge \mathrm{Word}(z) \wedge (\forall w)_{w < \ell h(e)} \neg\mathrm{Occ}(z, ((e)_w)_1)$. Let $\mathrm{SCons}(e, y, x)$ be
the recursive predicate that holds if and only if $e$ is the index of an algorithm
schema and $y$ and $x$ are Gödel numbers of words $V$ and $U$ such that $V$ arises
from $U$ by a simple production of the schema:


$$\mathrm{Ind}(e) \wedge \mathrm{Word}(x) \wedge \mathrm{Word}(y) \wedge (\exists v)_{v < \ell h(e)}[\mathrm{SI}((e)_v) \wedge \mathrm{Lsub}(x, y, (e)_v)
\wedge (\forall z)_{z < v} \neg\mathrm{Occ}(x, ((e)_z)_1)]$$


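Similarly, one defines the recursive predicate $\mathrm{TCons}(e, y, x)$, which differs
from $\mathrm{SCons}(e, y, x)$ only in that the production in question is terminal. Let
$\mathrm{Der}(e, x, y)$ be the recursive predicate that is true when and only when $e$
is the index of an algorithm schema that determines an algorithm $\mathfrak{A}$, $x$ is
the Gödel number of a word $U_0$, $y$ is the Gödel number of a sequence of
words $U_0, \ldots, U_k$ ($k \ge 0$) such that, for $0 \le i < k - 1$, $U_{i+1}$ arises from $U_i$ by a
production of the schema, and either $\mathfrak{A}: U_{k-1} \vdash\cdot\, U_k$ or $\mathfrak{A}: U_{k-1} \vdash U_k$ and
$\mathfrak{A}: U_k \sqsupset$ (or, if $k = 0$, just $\mathfrak{A}: U_0 \sqsupset$):


$$\mathrm{Ind}(e) \wedge \mathrm{Word}(x) \wedge (\forall z)_{z < \ell h(y)} \mathrm{Word}((y)_z) \wedge (y)_0 = x
\wedge (\forall z)_{z < \ell h(y) \dot{-} 2} \mathrm{SCons}(e, (y)_{z+1}, (y)_z) \wedge [(\ell h(y) = 1 \wedge \mathrm{End}(e, (y)_0))
\vee (\ell h(y) > 1 \wedge \{\mathrm{TCons}(e, (y)_{\ell h(y) \dot{-} 1}, (y)_{\ell h(y) \dot{-} 2}) \vee (\mathrm{SCons}(e, (y)_{\ell h(y) \dot{-} 1},
(y)_{\ell h(y) \dot{-} 2}) \wedge \mathrm{End}(e, (y)_{\ell h(y) \dot{-} 1}))\})]$$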


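Let $W_A(u)$ be the recursive predicate that holds if and only if $u$ is the Gödel
number of a word of $A$:


$$u \ne 0 \wedge (u = 1 \vee (\forall z)_{z < \ell h(u)}((u)_z = 7 + 4j_0 \vee \ldots \vee (u)_z = 7 + 4j_m))$$


Let $e$ be the index of an algorithm schema for $\mathfrak{A}$. Now define the partial
recursive function $\varphi(x) = \mu y((W_A(x) \wedge \mathrm{Der}(e, x, y)) \vee \neg W_A(x))$. But
$\psi_{\mathfrak{A}}(x) = (\varphi(x))_{\ell h(\varphi(x)) \dot{-} 1}$. Therefore, $\psi_{\mathfrak{A}}$ is partial recursive.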


Corollary 5.27 


Every Markov-computable function @ is partial recursive. 


Proof 


Let $\mathfrak{A}$ be a normal algorithm over $\{|, B\}$ such that $\varphi(k_1, \ldots, k_n) = l$ if and only
if $\mathfrak{A}(\overline{(k_1, \ldots, k_n)}) = \overline{l}$. By Proposition 5.26, the function $\psi_{\mathfrak{A}}$ is partial
recursive. Define the recursive function $\gamma(x) = \ell h(x) \dot{-} 1$. If $x = \prod_{i=0}^{n} p_i^{11}$, then $n = \gamma(x)$.
(Remember that a stroke $|$, which is an abbreviation for $a_1$, has Gödel number
11. So, if $x$ is the Gödel number of the numeral $\overline{n}$, then $\gamma(x) = n$.) Recall that
$\mathrm{TR}(k_1, \ldots, k_n)$ is the Gödel number of $\overline{(k_1, \ldots, k_n)}$.

$\mathrm{TR}$ is primitive recursive (by Proposition 5.4). Then $\varphi = \gamma \circ \psi_{\mathfrak{A}} \circ \mathrm{TR}$ is
partial recursive.
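The arithmetic in this proof can be checked on small cases. The Python sketch below is an illustration under stated assumptions: it takes the Gödel number of a word $u_0u_1 \ldots u_k$ to be $2^{g(u_0)}3^{g(u_1)} \cdots p_k^{g(u_k)}$ with $g(B) = 7$ and $g(|) = 11$, it takes $\ell h(x)$ to be the number of distinct primes dividing $x$, and all function names are hypothetical.

```python
from typing import Sequence

SYMBOL_CODE = {"B": 7, "|": 11}   # g(a_i) = 7 + 4i with B = a_0 and | = a_1


def first_primes(count: int) -> list:
    """The first `count` primes, by trial division."""
    primes = []
    candidate = 2
    while len(primes) < count:
        if all(candidate % p for p in primes):
            primes.append(candidate)
        candidate += 1
    return primes


def godel_number(word: str) -> int:
    """Product of p_i ** g(u_i) over the symbols u_i of `word`."""
    number = 1
    for p, symbol in zip(first_primes(len(word)), word):
        number *= p ** SYMBOL_CODE[symbol]
    return number


def lh(x: int) -> int:
    """Number of distinct primes dividing x (assumed reading of lh), x >= 1."""
    count, p = 0, 2
    while x > 1:
        if x % p == 0:
            count += 1
            while x % p == 0:
                x //= p
        p += 1
    return count


def gamma(x: int) -> int:
    """gamma(x) = lh(x) monus 1."""
    return max(lh(x) - 1, 0)


def numeral(n: int) -> str:
    return "|" * (n + 1)          # the numeral of n has n + 1 strokes


def tuple_word(numbers: Sequence[int]) -> str:
    return "B".join(numeral(n) for n in numbers)


print(tuple_word((3, 1, 2)))
print(gamma(godel_number(numeral(3))))
```

Since the numeral of $n$ consists of $n + 1$ strokes, its Gödel number has exactly $n + 1$ prime divisors, which is why subtracting 1 from the length recovers $n$.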

The equivalence of Markov computability and Turing computability fol- 
lows from Corollaries 5.25 and 5.27 and the known equivalence of Turing 
computability and partial recursiveness. Many other definitions of comput- 
ability have been given, all of them turning out to be equivalent to Turing 
computability. One of the earliest definitions, $\lambda$-computability, was developed
by Church and Kleene as part of the theory of $\lambda$-conversion (see Church, 1941).
Its equivalence with the intuitive notion of computability is not immediately
plausible and gained credence only when $\lambda$-computability was shown to be
equivalent to partial recursiveness and Turing computability (see Kleene, 
1936b; Turing, 1937). All reasonable variations of Turing computability seem 
to yield equivalent notions (see Oberschelp, 1958; Fischer, 1965). 
5.6 Decision Problems


A class of problems is said to be unsolvable if there is no effective procedure for 
solving each problem in the class. For example, given any polynomial f(x) with 
integral coefficients (for example, $3x^5 - 4x^4 + 7x^2 - 13x + 12$), is there an integer
k such that f(k) = 0? We can certainly answer this question for various special 
polynomials, but is there a single general procedure that will solve the prob- 
lem for every polynomial f(x)? (The answer is given below in paragraph 4.) 

If we can arithmetize the formulation of a class of problems and assign 
to each problem a natural number, then this class is unsolvable if and only 
if there is no computable function h such that, if n is the number of a given 
problem, then h(n) yields the solution of the problem. If Church’s thesis is 
assumed, the function $h$ has to be partial recursive, and we then have a more
accessible mathematical question. 

Davis (1977b) gives an excellent survey of research on unsolvable prob- 
lems. Let us look at a few decision problems, some of which we already have 
solved. 


1. Is a statement form of the propositional calculus a tautology? Truth tables
provide an easy, effective procedure for answering any such question. 


2. Decidable and undecidable theories. Is there a procedure for determining
whether an arbitrary wf of a formal system $\mathcal{S}$ is a theorem of $\mathcal{S}$? If so,
$\mathcal{S}$ is called decidable; otherwise, it is undecidable.


a. The system L of Chapter 1 is decidable. The theorems of L are the 
tautologies, and we can apply the truth table method. 


b. The pure predicate calculus PP and the full predicate calculus PF 
were both shown to be recursively undecidable in Proposition 3.54. 


c. The theory RR and all its consistent extensions (including Peano 
arithmetic S) have been shown to be recursively undecidable in 
Corollary 3.46. 


d. The axiomatic set theory NBG and all its consistent extensions are 
recursively undecidable (see page 273). 


e. Various theories concerning order structures or algebraic structures 
have been shown to be decidable (often by the method of quanti- 
fier elimination). Examples are the theory of unbounded densely 
ordered sets (see page 115 and Langford, 1927), the theory of abe- 
lian groups (Szmielew, 1955), and the theory of real-closed fields 
(Tarski, 1951). For further information, consult Kreisel and Krivine 
(1967, Chapter 4); Chang and Keisler (1973, Chapter 1.5); Monk (1976, 
Chapter 13); Ershov et al. (1965); Rabin (1977); and Baudisch et al. 
(1985). On the other hand, the undecidability of many algebraic the- 
ories can be derived from the results in Chapter 3 (see Tarski et al, 
1953, II.6, II; Monk, 1976, Chapter 16). 
3. Logical validity. Is a given wf of quantification theory logically valid? By
Gödel’s completeness theorem (Corollary 2.19), a wf is logically valid if
and only if it is provable in the full predicate calculus PF. Since PF is
recursively undecidable (Proposition 3.54), the problem of logical validity
is recursively unsolvable.

However, there is a decision procedure for the logical validity of wfs
of the pure monadic predicate calculus (Exercise 3.59).

There have been extensive investigations of decision procedures for
various important subclasses of wfs of the pure predicate calculus; for
example, the class $(\forall\exists\forall)$ of all closed wfs of the form $(\forall x)(\exists y)(\forall z)\mathcal{A}(x, y, z)$,
where $\mathcal{A}(x, y, z)$ contains no quantifiers. See Ackermann (1954),
Dreben and Goldfarb (1980) and Lewis (1979).


4. Hilbert’s Tenth Problem. If $f(x_1, \ldots, x_n)$ is a polynomial with integral
coefficients, are there integers $k_1, \ldots, k_n$ such that $f(k_1, \ldots, k_n) = 0$? This difficult
decision problem is known as Hilbert’s tenth problem.

For one variable, the solution is easy. When $a_0, a_1, \ldots, a_n$ are integers,
any integer $x$ such that $a_nx^n + \cdots + a_1x + a_0 = 0$ must be a divisor of $a_0$.
Hence, when $a_0 \ne 0$, we can test each of the finite number of divisors
of $a_0$. If $a_0 = 0$, then $x = 0$ is a solution. However, there is no analogous
procedure when the polynomial has more than one variable. It was
finally shown by Matiyasevich (1970) that there is no decision procedure
for determining whether a polynomial with integral coefficients has
a solution consisting of integers. His proof was based in part on some
earlier work of Davis et al. (1961). The proof ultimately relies on basic
facts of recursion theory, particularly the existence of a non-recursive
r.e. set (Proposition 5.21). An up-to-date exposition may be found in
Matiyasevich (1993).
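The one-variable procedure described in paragraph 4 can be written out directly. The Python sketch below is an illustration, not part of the text: the coefficient list is assumed to be given lowest degree first, as `[a_0, a_1, ..., a_n]`, with at least one nonzero entry, and the function names are hypothetical.

```python
from typing import List


def evaluate(coeffs: List[int], x: int) -> int:
    """Value of a_0 + a_1 x + ... + a_n x^n, computed by Horner's rule."""
    result = 0
    for a in reversed(coeffs):
        result = result * x + a
    return result


def has_integer_root(coeffs: List[int]) -> bool:
    """Decide whether the polynomial with coefficients [a_0, ..., a_n]
    (not all zero) has an integer root."""
    a0 = coeffs[0]
    if a0 == 0:
        return True                       # x = 0 is a root
    # Any integer root divides a_0, so only finitely many candidates remain.
    for d in range(1, abs(a0) + 1):
        if a0 % d == 0 and (evaluate(coeffs, d) == 0
                            or evaluate(coeffs, -d) == 0):
            return True
    return False


print(has_integer_root([6, -5, 1]))   # x^2 - 5x + 6
print(has_integer_root([1, 0, 1]))    # x^2 + 1
```

The loop terminates because a nonzero integer has only finitely many divisors; nothing comparable bounds the search when there are two or more variables.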


5. Word problems.


Semi-Thue Systems. Let $B = \{b_1, \ldots, b_n\}$ be a finite alphabet. Remember that
a word of $B$ is a finite sequence of elements of $B$. Moreover, the empty
sequence $\Lambda$ is considered a word of $B$. By a production of $B$ we mean an
ordered pair $\langle u, v\rangle$, where $u$ and $v$ are words of $B$. If $p = \langle u, v\rangle$ is a production
of $B$, and if $w$ and $w'$ are words of $B$, we write $w \Rightarrow_p w'$ if $w'$ arises from
$w$ by replacing a part $u$ of $w$ by $v$. (Recall that $u$ is a part of $w$ if there exist
(possibly empty) words $w_1$ and $w_2$ such that $w = w_1uw_2$.)


By a semi-Thue system on $B$ we mean a finite set $\mathcal{S}$ of productions of $B$. For
words $w$ and $w'$ of $B$, we write $w \Rightarrow_{\mathcal{S}} w'$ if there is a finite sequence $w_0, w_1, \ldots,
w_k$ ($k \ge 0$) of words of $B$ such that $w = w_0$, $w' = w_k$ and, for $0 \le i < k$, there is a
production $p$ of $\mathcal{S}$ such that $w_i \Rightarrow_p w_{i+1}$. Observe that $w \Rightarrow_{\mathcal{S}} w$ for any word $w$
of $B$. Moreover, if $w_1 \Rightarrow_{\mathcal{S}} w_2$ and $w_2 \Rightarrow_{\mathcal{S}} w_3$, then $w_1 \Rightarrow_{\mathcal{S}} w_3$. In addition, if $w_1 \Rightarrow_{\mathcal{S}}
w_2$ and $w_3 \Rightarrow_{\mathcal{S}} w_4$, then $w_1w_3 \Rightarrow_{\mathcal{S}} w_2w_4$. Notice that there is no fixed order in
which the productions have to be applied and that many different productions
of $\mathcal{S}$ might be applicable to the same word.
By a Thue system we mean a semi-Thue system such that, for every production
$\langle u, v\rangle$, the inverse $\langle v, u\rangle$ is also a production. Clearly, if $\mathcal{S}$ is a Thue system
and $w \Rightarrow_{\mathcal{S}} w'$, then $w' \Rightarrow_{\mathcal{S}} w$. Hence, $\Rightarrow_{\mathcal{S}}$ is an equivalence relation on the set
of words of the alphabet of $\mathcal{S}$.


Example


Let $\mathcal{S}^*$ be the Thue system that has alphabet $\{b\}$ and productions $\langle b^3, \Lambda\rangle$ and
$\langle \Lambda, b^3\rangle$. It is easy to see that every word is transformable into $b^2$, $b$, or $\Lambda$.
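Unlike a normal algorithm, a semi-Thue system may apply any production at any occurrence, so a derivation is a path in a graph of words. The Python sketch below explores that graph breadth-first for words up to a chosen length. It is an illustration under stated assumptions: words are strings, productions are pairs `(u, v)`, and the names `one_step` and `derivable_within` are hypothetical. A negative answer only means that no derivation exists among words of length at most `max_length`; it is not a decision procedure for the word problem.

```python
from collections import deque


def one_step(word, productions):
    """All words obtainable from `word` by one application of one production."""
    results = set()
    for u, v in productions:
        start = word.find(u)
        while start != -1:                # try every occurrence of u
            results.add(word[:start] + v + word[start + len(u):])
            start = word.find(u, start + 1)
    return results


def derivable_within(source, target, productions, max_length):
    """Breadth-first search restricted to words of length <= max_length."""
    seen, queue = {source}, deque([source])
    while queue:
        word = queue.popleft()
        if word == target:
            return True
        for nxt in one_step(word, productions):
            if len(nxt) <= max_length and nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return False


semi_thue = [("bbb", "")]                 # only b^3 -> empty word
thue = [("bbb", ""), ("", "bbb")]         # the production and its inverse
print(derivable_within("bbbb", "b", semi_thue, 10))
print(derivable_within("b", "bbbb", semi_thue, 10))
print(derivable_within("b", "bbbb", thue, 10))
```

With the single production the relation is one-directional, whereas adding the inverse production makes it symmetric, as the definition of a Thue system requires.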

By a semigroup we mean a nonempty set $G$ together with a binary operation
on $G$ (denoted by the juxtaposition $uv$ of elements $u$ and $v$) that satisfies the
associative law $x(yz) = (xy)z$. An element $y$ such that $xy = yx = x$ for all $x$ in $G$
is called an identity element. If an identity element exists, it is unique and is
denoted 1.

A Thue system $\mathcal{S}$ on an alphabet $B$ determines a semigroup $G$ with an
identity element. In fact, for each word $w$ of $B$, let $[w]$ be the set of all words
$w'$ such that $w \Rightarrow_{\mathcal{S}} w'$. $[w]$ is just the equivalence class of $w$ with respect to $\Rightarrow_{\mathcal{S}}$.
Let $G$ consist of the sets $[w]$ for all words $w$ of $B$. If $U$ and $V$ are elements of
$G$, choose a word $u$ in $U$ and a word $v$ in $V$. Let $UV$ stand for the set $[uv]$. This
defines an operation on $G$, since, if $u'$ is any word in $U$ and $v'$ is any word in
$V$, $[uv] = [u'v']$.


Exercises 


5.65 For the set $G$ determined by the Thue system $\mathcal{S}$, prove:
a. (UV) W=U(VW) for all members U, V and W of G. 


b. The equivalence class [A] of the empty word A acts as an identity 
element of G. 


5.66 a. Show that a semigroup contains at most one identity element. 
b. Give an example of a semigroup without an identity element. 


A Thue system $\mathcal{S}$ provides what is called a finite presentation of the
corresponding semigroup $G$. The elements $b_1, \ldots, b_n$ of the alphabet of $\mathcal{S}$ are called
generators, and the productions $\langle u, v\rangle$ of $\mathcal{S}$ are written in the form of
equations $u = v$. These equations are called the relations of the presentation. Thus,
in the Example above, $b$ is the only generator and $b^3 = \Lambda$ can be taken as the
only relation. The corresponding semigroup is a cyclic group of order 3.

If $\mathcal{S}$ is a semi-Thue or Thue system, the word problem for $\mathcal{S}$ is the problem
of determining, for any words $w$ and $w'$, whether $w \Rightarrow_{\mathcal{S}} w'$.


Exercises 


5.67 Show that, for the Thue system $\mathcal{S}^*$ in the Example, the word problem is
solvable.
5.68 Consider the following Thue system $\mathcal{S}$. The alphabet is $\{a, b, c, d\}$ and
the productions are $\langle ac, \Lambda\rangle$, $\langle ca, \Lambda\rangle$, $\langle bd, \Lambda\rangle$, $\langle db, \Lambda\rangle$, $\langle a^3, \Lambda\rangle$, $\langle b^2, \Lambda\rangle$,
$\langle ab, ba\rangle$, and their inverses.

a. Show that $c \Rightarrow_{\mathcal{S}} a^2$ and $d \Rightarrow_{\mathcal{S}} b$.


b. Show that every word of $\mathcal{S}$ can be transformed into one of the words
$a$, $a^2$, $b$, $ab$, $a^2b$, and $\Lambda$.

c. Show that the word problem for $\mathcal{S}$ is solvable. [Hint: To show that
the six words of part (b) cannot be transformed into one another,
use the cyclic group of order 6 generated by an element $g$, with
$a = g^2$ and $b = g^3$.]


Proposition 5.30 


(Post, 1947) There exists a Thue system with a recursively unsolvable word 
problem. 


Proof 


Let $\mathcal{T}$ be a Turing machine with alphabet $\{a_0, a_1, \ldots, a_n\}$ and internal states
$\{q_0, q_1, \ldots, q_m\}$. Remember that a tape description is a sequence of symbols
describing the condition of $\mathcal{T}$ at any given moment; it consists of symbols
of the alphabet of $\mathcal{T}$ plus one internal state $q_j$, and $q_j$ is not the last symbol
of the description. $\mathcal{T}$ is in state $q_j$, scanning the symbol following $q_j$, and
the alphabet symbols, read from left to right, constitute the entire tape at
the given moment. We shall construct a semi-Thue system $\mathcal{S}$ that will reflect
the operation of $\mathcal{T}$: each action induced by quadruples of $\mathcal{T}$ will be copied
by productions of $\mathcal{S}$. The alphabet of $\mathcal{S}$ consists of $\{a_0, a_1, \ldots, a_n, q_0, q_1, \ldots, q_m,
\beta, \delta, \xi\}$. The symbol $\beta$ will be placed at the beginning and end of a tape
description in order to “alert” the semi-Thue system when it is necessary to add an
extra blank square on the left or right end of the tape. We wish to ensure that,
if $W \Rightarrow_{\mathcal{T}} W'$, then $\beta W\beta \Rightarrow_{\mathcal{S}} \beta W'\beta$. The productions of $\mathcal{S}$ are constructed from
the quadruples of $\mathcal{T}$ in the following manner.


a. If $q_ja_ia_kq_r$ is a quadruple of $\mathcal{T}$, let $\langle q_ja_i, q_ra_k\rangle$ be a production of $\mathcal{S}$.

b. If $q_ja_iRq_r$ is a quadruple of $\mathcal{T}$, let $\langle q_ja_ia_l, a_iq_ra_l\rangle$ be a production of $\mathcal{S}$ for
every $a_l$. In addition, let $\langle q_ja_i\beta, a_iq_ra_0\beta\rangle$ be a production of $\mathcal{S}$. (This last
production adds a blank square when $\mathcal{T}$ reaches the right end of the
tape and is ordered to move right.)

c. If $q_ja_iLq_r$ is a quadruple of $\mathcal{T}$, let $\langle a_lq_ja_i, q_ra_la_i\rangle$ be a production of $\mathcal{S}$ for
each $a_l$. In addition, let $\langle \beta q_ja_i, \beta q_ra_0a_i\rangle$ be a production of $\mathcal{S}$. (This last
production adds a blank square to the left of the tape when this is
required.)

d. If there is no quadruple of $\mathcal{T}$ beginning with $q_ja_i$, let $\mathcal{S}$ contain the following
productions: $\langle q_ja_i, \delta\rangle$, $\langle \delta a_l, \delta\rangle$ for all $a_l$, $\langle \delta\beta, \xi\rangle$, $\langle a_l\xi, \xi\rangle$ for all $a_l$, and $\langle \beta\xi, \xi\rangle$.


$\mathcal{T}$ stops when it is in a state $q_j$, scanning a symbol $a_i$, such that $q_ja_i$ does not
begin a quadruple of $\mathcal{T}$. In such a case, $\mathcal{S}$ would replace $q_ja_i$ in the final tape
description of $\mathcal{T}$ by $\delta$. Then $\delta$ proceeds to annihilate all the other symbols to
its right, including the rightmost $\beta$, whereupon it changes to $\xi$. $\xi$ then
annihilates all symbols to its left, including the remaining $\beta$. The final result is $\xi$
alone. Hence:

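$(\square)$ For any initial tape description $\alpha$, $\mathcal{T}$ halts when and only when $\beta\alpha\beta \Rightarrow_{\mathcal{S}} \xi$.


Now, enlarge $\mathcal{S}$ to a Thue system $\mathcal{S}'$ by adding to $\mathcal{S}$ the inverses of all the
productions of $\mathcal{S}$. Let us show that


$(\nabla)$ For any initial tape description $\alpha$ of $\mathcal{T}$, $\beta\alpha\beta \Rightarrow_{\mathcal{S}'} \xi$ if and only if $\beta\alpha\beta \Rightarrow_{\mathcal{S}} \xi$.

Clearly, if $\beta\alpha\beta \Rightarrow_{\mathcal{S}} \xi$, then $\beta\alpha\beta \Rightarrow_{\mathcal{S}'} \xi$. Conversely, assume for the sake of
contradiction that $\beta\alpha\beta \Rightarrow_{\mathcal{S}'} \xi$, but it is not the case that $\beta\alpha\beta \Rightarrow_{\mathcal{S}} \xi$. Consider a
sequence of words leading from $\beta\alpha\beta$ to $\xi$ in $\mathcal{S}'$:


$$\beta\alpha\beta = w_0 \Rightarrow_{\mathcal{S}'} \cdots \Rightarrow_{\mathcal{S}'} w_{t-1} \Rightarrow_{\mathcal{S}'} w_t = \xi$$


Here, each arrow is intended to indicate a single application of a production.
It is clear from the definition of $\mathcal{S}$ that no production of $\mathcal{S}$ applies to $\xi$
alone. Hence, the last step in the sequence $w_{t-1} \Rightarrow_{\mathcal{S}'} \xi$ must be the result of a
production of $\mathcal{S}$. So, $w_{t-1} \Rightarrow_{\mathcal{S}} \xi$. Working backward, let us find the least $p$ such
that $w_p \Rightarrow_{\mathcal{S}} \xi$. Since we have assumed that it is not true that $\beta\alpha\beta \Rightarrow_{\mathcal{S}} \xi$, we must
have $p > 0$. By the minimality of $p$, it is not true that $w_{p-1} \Rightarrow_{\mathcal{S}} w_p$. Therefore,
$w_p \Rightarrow_{\mathcal{S}} w_{p-1}$. Examination of the productions of $\mathcal{S}$ shows that each of the words
$w_0, w_1, \ldots, w_t$ must contain exactly one of the symbols $q_0, q_1, \ldots, q_m$, $\delta$, or $\xi$,
and that, to such a word, at most one production of $\mathcal{S}$ is applicable. But, $w_p$ is
transformed into both $w_{p+1}$ and $w_{p-1}$ by productions of $\mathcal{S}$. Hence, $w_{p-1} = w_{p+1}$.
But, $w_{p+1} \Rightarrow_{\mathcal{S}} \xi$. Hence, $w_{p-1} \Rightarrow_{\mathcal{S}} \xi$, contradicting the definition of $p$. This
establishes $(\nabla)$.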

Now, let $\mathcal{T}$ be a Turing machine with a recursively unsolvable halting
problem (Proposition 5.14). Construct the corresponding Thue system $\mathcal{S}'$ as
above. Then, by (C) and (V), for any tape description $\alpha$, $\mathcal{T}$ halts if and only if
$\beta \alpha \beta \Rightarrow_{\mathcal{S}'} \xi$. So, if the word problem for $\mathcal{S}'$ were recursively solvable, the
halting problem for $\mathcal{T}$ would be recursively solvable. (The function that assigns
to the Gödel number of $\alpha$ the Gödel number of $(\beta \alpha \beta, \xi)$ is clearly recursive
under a suitable arithmetization of the symbolism of Turing machines and
Thue systems.) Thus, $\mathcal{S}'$ has a recursively unsolvable word problem.

That the word problem is unsolvable even for certain Thue systems on a two- 
element alphabet (semigroups with two generators) was proved by Hall (1949). 


a. Finitely presented groups. A finite presentation of a group consists of a finite
set of generators $g_1, \ldots, g_r$ and a finite set of equations $W_1 = W_1', \ldots, W_t
= W_t'$ between words of the alphabet $B = \{g_1, \ldots, g_r, g_1^{-1}, \ldots, g_r^{-1}\}$. What
is really involved here is a Thue system $\mathcal{S}$ with alphabet B, productions
$(W_1, W_1'), \ldots, (W_t, W_t')$ and their inverses, and all the productions
$(g_i g_i^{-1}, \Lambda)$, $(g_i^{-1} g_i, \Lambda)$ and their inverses. The corresponding semigroup
G is actually a group and is called a finitely presented group. The word
problem for G (or, rather, for the finite presentation of G) is the word
problem for the Thue system $\mathcal{S}$.


Problems that concern word problems for finitely presented groups are 
generally much more difficult than corresponding problems for finitely 
presented semigroups (Thue systems). The existence of a finitely presented 
group with a recursively unsolvable word problem was proved, indepen- 
dently, by Novikov (1955) and Boone (1959). Other proofs have been given by 
Higman (1961), Britton (1963), and McKenzie and Thompson (1973). (See also
Rotman, 1973.) Results on other decision problems connected with groups 
may be found in Rabin (1958). For corresponding problems in general alge- 
braic systems, consult Evans (1951). 


Appendix A: Second-Order Logic 


Our treatment of quantification theory in Chapter 2 was confined to first- 
order logic; that is, the variables used in quantifiers were only individual 
variables. The axiom systems for formal number theory in Chapter 3 and 
set theory in Chapter 4 also were formulated within first-order languages. 
This restriction brings with it certain advantages and disadvantages, and 
we wish now to see what happens when the restriction is lifted. That will 
mean allowing quantification with respect to predicate and function vari- 
ables. Emphasis will be on second-order logic, since the important differ- 
ences between first-order and higher-order logics already reveal themselves 
at the second-order level. Our treatment will offer only a sketch of the basic 
ideas and results of second-order logic. 

Let L1C be the first-order language in which C is the set of nonlogical
constants (i.e., individual constants, function letters, and predicate letters).
Start with the language L1C, and add function variables $\mathbf{g}_i^n$ and predicate
variables $\mathbf{R}_i^n$, where n and i are any positive integers.* (We shall use $\mathbf{g}^n, \mathbf{h}^n, \ldots$
to stand for any function variables of n arguments and $\mathbf{R}^n, \mathbf{S}^n, \ldots, \mathbf{X}^n, \mathbf{Y}^n, \mathbf{Z}^n$
to stand for any predicate variables of n arguments; we shall also omit the
superscript n when the value of n is clear from the context.) Let $\langle u \rangle_n$ stand
for any sequence of individual variables $u_1, \ldots, u_n$,† and let $\forall \langle u \rangle_n$ stand for
the expression $(\forall u_1) \ldots (\forall u_n)$. Similarly, let $\langle t \rangle_n$ stand for a sequence of terms
$t_1, \ldots, t_n$. We expand the set of terms by allowing formation of terms $\mathbf{g}^n(\langle t \rangle_n)$,
where $\mathbf{g}^n$ is a function variable, and we then expand the set of formulas by
allowing formation of atomic formulas $A_j^n(\langle t \rangle_n)$ and $\mathbf{R}^n(\langle t \rangle_n)$, where $\langle t \rangle_n$ is
any sequence of the newly enlarged set of terms, $A_j^n$ is any predicate letter
of C, and $\mathbf{R}^n$ is any n-ary predicate variable. Finally, we expand the set of
formulas by quantification $(\forall \mathbf{g}^n)\mathcal{B}$ and $(\forall \mathbf{R}^n)\mathcal{B}$ with respect to function and
predicate variables.

Let L2C denote the second-order language obtained in this way. The lan- 
guage L2C will be called a full second-order language. The adjective “full” 
indicates that we allow both function variables and predicate variables and 
that there is no restriction on the arity n of those variables. An example of 
a nonfull second-order language is the second-order monadic predicate 


* We use bold letters to avoid confusion with function letters and predicate letters. Note that 
function letters and predicate letters are supposed to denote specific operations and rela- 
tions, whereas function variables and predicate variables vary over arbitrary operations and 
relations. 

† In particular, $\langle x \rangle_n$ will stand for $x_1, \ldots, x_n$.
language in which there are no function letters or variables, no predicate
letters, and only monadic predicate variables.*

It is not necessary to take = as a primitive symbol, since it can be defined 
in the following manner. 


Definitions 


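$t = u$ stands for $(\forall \mathbf{R}^1)(\mathbf{R}^1 t \Leftrightarrow \mathbf{R}^1 u)$

$\mathbf{g}^n = \mathbf{h}^n$ stands for $\forall \langle x \rangle_n (\mathbf{g}^n(\langle x \rangle_n) = \mathbf{h}^n(\langle x \rangle_n))$

$\mathbf{R}^n = \mathbf{S}^n$ stands for $\forall \langle x \rangle_n (\mathbf{R}^n(\langle x \rangle_n) \Leftrightarrow \mathbf{S}^n(\langle x \rangle_n))$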
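
The first definition says that two objects are equal exactly when no monadic predicate separates them. As an illustration only, the following sketch checks this on a small finite domain, with the predicate variable ranging over all subsets of the domain (which is how the standard semantics described next treats it). The function names are hypothetical and not from the text.

```python
from itertools import combinations


def all_subsets(domain):
    """Yield every subset of a finite domain as a frozenset."""
    items = list(domain)
    for size in range(len(items) + 1):
        for combo in combinations(items, size):
            yield frozenset(combo)


def defined_equal(a, b, unary_relations):
    """(∀R)(R a ⇔ R b), with R ranging over the given unary relations."""
    return all((a in r) == (b in r) for r in unary_relations)


D = {0, 1, 2}
relations = list(all_subsets(D))
# With R ranging over all subsets, defined equality coincides with identity.
agrees = all(defined_equal(a, b, relations) == (a == b) for a in D for b in D)
print(agrees)
```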


Standard Second-Order Semantics for L2C 


For a given language L2C, let us start with a first-order interpretation with
domain D. In the first-order case, we defined satisfaction for the set $\Sigma$ of
denumerable sequences of members of D. Now, instead of $\Sigma$, we use the
set $\Sigma_2$ of functions s that assign to each individual variable a member of
D, to each function variable $\mathbf{g}^n$ some n-ary operation $s(\mathbf{g}^n)$ on D, and to
each predicate variable $\mathbf{R}^n$ some n-ary relation† $s(\mathbf{R}^n)$ on D. For each such
s, we extend the denotations determined by s by specifying that, for any
terms $t_1, \ldots, t_n$ and any function variable $\mathbf{g}^n$, the denotation $s(\mathbf{g}^n(t_1, \ldots, t_n))$ is
$s(\mathbf{g}^n)(s(t_1), \ldots, s(t_n))$. The first-order definition of satisfaction is extended as
follows:


a. For any predicate variable $\mathbf{R}^n$ and any finite sequence $\langle t \rangle_n$ of terms, s
satisfies $\mathbf{R}^n(\langle t \rangle_n)$ if and only if $\langle s(t_1), \ldots, s(t_n) \rangle \in s(\mathbf{R}^n)$.


b. s satisfies $(\forall \mathbf{g}^n)\mathcal{B}$ if and only if s′ satisfies $\mathcal{B}$ for every s′ in $\Sigma_2$ that
agrees with s except possibly at $\mathbf{g}^n$.


c. s satisfies $(\forall \mathbf{R}^n)\mathcal{B}$ if and only if s′ satisfies $\mathcal{B}$ for every s′ in $\Sigma_2$ that
agrees with s except possibly at $\mathbf{R}^n$.


The resulting interpretation $\mathcal{M}$ is called a standard interpretation of the given
language.


* Third-order logics are obtained by adding function and predicate letters and variables that 
can have as arguments individual variables, function and predicate letters, and second-order 
function and predicate variables and then allowing quantification with respect to the new 
function and predicate variables. This procedure can be iterated to obtain nth-order logics for
all n > 1.

† An n-ary relation on D is a subset of the set $D^n$ of n-tuples of D. When n = 1, an n-ary relation
is just a subset of D.
A formula $\mathcal{B}$ is said to be true for a standard interpretation $\mathcal{M}$ (written
$\mathcal{M} \vDash_2 \mathcal{B}$) if $\mathcal{B}$ is satisfied by every s in $\Sigma_2$. $\mathcal{B}$ is false for $\mathcal{M}$ if no function s in $\Sigma_2$
satisfies $\mathcal{B}$.

A formula $\mathcal{B}$ is said to be standardly valid if $\mathcal{B}$ is true for all standard
interpretations. $\mathcal{B}$ is said to be standardly satisfiable if $\mathcal{B}$ is satisfied by
some s in $\Sigma_2$ in some standard interpretation. A formula $\mathcal{C}$ is said to be
a standard logical consequence of a set $\Gamma$ of formulas if, for every standard
interpretation, every s in $\Sigma_2$ that satisfies every formula in $\Gamma$ also satisfies $\mathcal{C}$.
A formula $\mathcal{B}$ is said to standardly logically imply a formula $\mathcal{C}$ if $\mathcal{C}$ is a standard
logical consequence of {$\mathcal{B}$}.

The basic properties of satisfaction, truth, logical consequence, and logical
implication that held in the first-order case (see (I)-(XI) on pages 57-60)
also hold here for their standard versions. In particular, a sentence
$\mathcal{B}$ is standardly satisfiable if and only if $\mathcal{B}$ is true for some standard
interpretation.

We shall see that second-order languages have much greater expressive 
power than first-order languages. This is true even in the case where the set 
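C of nonlogical constants is empty. The corresponding language L2∅ will
be denoted L2 and called the pure full second-order language. Consider the
following sentence in L2:


(1) $(\exists \mathbf{g})(\exists x)(\forall \mathbf{R})[(\mathbf{R}(x) \land (\forall y)(\mathbf{R}(y) \Rightarrow \mathbf{R}(\mathbf{g}(y)))) \Rightarrow (\forall x)\mathbf{R}(x)]$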


This sentence is true for a standard interpretation if and only if the domain
D is finite or denumerable. To see this, consider an operation g and element
x given by this sentence. By induction, define the sequence x, g(x),
g(g(x)), g(g(g(x))), ..., and let R be the set of objects in this sequence. R is
finite or denumerable, and (1) tells us that every object in D is in R. Hence,
D = R and D is finite or denumerable. Conversely, assume that D is finite or
denumerable. Let F be a one-one function from D onto $\omega$ (when D is
denumerable) or onto an initial segment {0, 1, ..., n} of $\omega$ (when D is finite).* Let
$x = F^{-1}(0)$ and define an operation g on D in the following manner. When D
is denumerable, $g(u) = F^{-1}(F(u) + 1)$ for all u in D; when D is finite, let $g(u) =
F^{-1}(F(u) + 1)$ if F(u) < n and g(u) = x if F(u) = n. With this choice of g and x,
(1) holds.
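
On a finite domain, the standard truth condition for (1) can be checked by exhaustive search, since there are only finitely many operations g, elements x, and subsets R to consider. The sketch below is an illustration under that assumption; the helper names are hypothetical, and an operation g on {0, ..., n-1} is encoded as a tuple with g[y] the value at y.

```python
from itertools import combinations, product


def all_subsets(domain):
    """Yield every subset of a finite domain as a frozenset."""
    items = list(domain)
    for size in range(len(items) + 1):
        for combo in combinations(items, size):
            yield frozenset(combo)


def closed_sets_are_everything(domain, g, x):
    """Check (∀R)[(R(x) ∧ (∀y)(R(y) ⇒ R(g(y)))) ⇒ (∀x)R(x)] for fixed g and x."""
    whole = frozenset(domain)
    for R in all_subsets(domain):
        contains_x = x in R
        closed_under_g = all(g[y] in R for y in R)
        if contains_x and closed_under_g and R != whole:
            return False
    return True


def sentence_one_holds(n):
    """Search for an operation g and an element x witnessing (1) on {0, ..., n-1}."""
    domain = range(n)
    for g in product(domain, repeat=n):
        for x in domain:
            if closed_sets_are_everything(domain, g, x):
                return True
    return False


print(all(sentence_one_holds(n) for n in (1, 2, 3)))
```

The cyclic operation used in the text's argument (send each element to its successor under F, and the last element back to x) is one of the witnesses the search finds; the identity operation is not, as soon as the domain has more than one element.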


Exercise 


A.1 Show that there is no first-order sentence $\mathcal{B}$ such that $\mathcal{B}$ is true in an
interpretation if and only if its domain is finite or denumerable. (Hint:
Use Corollary 2.22.)


* Remember that the domain of an interpretation is assumed to be nonempty. 
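Let us introduce the abbreviations $\mathbf{Y}^1 \subseteq \mathbf{X}^1$ for $(\forall u)(\mathbf{Y}^1(u) \Rightarrow \mathbf{X}^1(u))$,
NonEm($\mathbf{X}^1$) for $(\exists u)(\mathbf{X}^1(u))$, and Asym($\mathbf{R}^2$, $\mathbf{X}^1$) for
$(\forall u)(\forall v)(\mathbf{X}^1(u) \land \mathbf{X}^1(v) \land \mathbf{R}^2(u, v) \Rightarrow \neg \mathbf{R}^2(v, u))$.
Let $\mathbf{R}^2$ We $\mathbf{X}^1$ stand for the second-order formula:


Asym($\mathbf{R}^2$, $\mathbf{X}^1$) $\land\, (\forall \mathbf{Y}^1)(\mathbf{Y}^1 \subseteq \mathbf{X}^1 \land$ NonEm($\mathbf{Y}^1$)
$\Rightarrow (\exists u)(\mathbf{Y}^1(u) \land (\forall v)(\mathbf{Y}^1(v) \land v \neq u \Rightarrow \mathbf{R}^2(u, v))))$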


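Then $\mathbf{R}^2$ We $\mathbf{X}^1$ is satisfied by an assignment in a given standard
interpretation if and only if the binary relation assigned to $\mathbf{R}^2$ well-orders the set
assigned to $\mathbf{X}^1$. (First note that the asymmetry Asym($\mathbf{R}^2$, $\mathbf{X}^1$) implies that $\mathbf{R}^2$ is
irreflexive on $\mathbf{X}^1$. To see that $\mathbf{R}^2$ is transitive on $\mathbf{X}^1$, assume $\mathbf{R}^2(u, v)$ and $\mathbf{R}^2(v, w)$.
Letting $\mathbf{Y}^1$ = {u, v, w}, we leave it as an exercise to conclude that $\mathbf{R}^2(u, w)$. To
show that $\mathbf{R}^2$ is connected on $\mathbf{X}^1$, take any two distinct elements x and y of $\mathbf{X}^1$
and consider $\mathbf{Y}^1$ = {x, y}.)

Let Suc(u, v, $\mathbf{R}^2$) stand for $\mathbf{R}^2(v, u) \land (\forall w)\neg(\mathbf{R}^2(v, w) \land \mathbf{R}^2(w, u))$, and let
First(u, $\mathbf{R}^2$) stand for $(\forall v)(v \neq u \Rightarrow \mathbf{R}^2(u, v))$. Consider the following
second-order formula:


(2) $(\exists \mathbf{R}^2)(\exists \mathbf{X}^1)(\mathbf{R}^2$ We $\mathbf{X}^1 \land (\forall u)\mathbf{X}^1(u) \land (\forall u)(\neg$First$(u, \mathbf{R}^2)$
$\Rightarrow (\exists v)$Suc$(u, v, \mathbf{R}^2)) \land (\exists u)(\forall v)(v \neq u \Rightarrow \mathbf{R}^2(v, u)))$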


This is true for a standard interpretation if and only if there is a well-ordering 
of the domain in which every element other than the first is a successor and 
there is a last element. But this is equivalent to the domain being finite. 
Hence, (2) is true for a standard interpretation if and only if its domain is 
finite. 
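
For a small finite domain, the claim that the formula abbreviated by We picks out exactly the well-orderings can be confirmed by brute force. The following sketch is only an illustration: it enumerates every binary relation on a three-element domain, evaluates the We condition with the set variable assigned the whole domain, and compares the result with a direct test for strict linear orders (which, on a finite set, are the same as well-orderings). All function names are hypothetical.

```python
from itertools import combinations, product


def nonempty_subsets(items):
    """Yield every nonempty subset of a finite collection as a tuple."""
    items = list(items)
    for size in range(1, len(items) + 1):
        yield from combinations(items, size)


def satisfies_we(R, X):
    """Evaluate Asym(R, X) together with the least-element clause of R We X."""
    for u in X:
        for v in X:
            if (u, v) in R and (v, u) in R:
                return False  # asymmetry fails
    for Y in nonempty_subsets(X):
        has_least = any(all((u, v) in R for v in Y if v != u) for u in Y)
        if not has_least:
            return False
    return True


def is_strict_linear_order(R, X):
    """Irreflexive, transitive, and connected on X."""
    irreflexive = all((u, u) not in R for u in X)
    transitive = all(
        (u, w) in R
        for u in X for v in X for w in X
        if (u, v) in R and (v, w) in R
    )
    connected = all((u, v) in R or (v, u) in R for u in X for v in X if u != v)
    return irreflexive and transitive and connected


def all_relations(X):
    """Yield every binary relation on X as a set of ordered pairs."""
    pairs = list(product(X, repeat=2))
    for bits in product([False, True], repeat=len(pairs)):
        yield {p for p, keep in zip(pairs, bits) if keep}


D = (0, 1, 2)
well_orderings = [R for R in all_relations(D) if satisfies_we(R, D)]
print(len(well_orderings))
```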


Exercise 


A.2 (a) Show that, for every natural number n, there is a first-order sentence 
the models of which are all interpretations whose domain contains at 
least n elements. (b) Show that, for every positive integer n, there is 
a first-order theory the models of which are all interpretations whose 
domain contains exactly n elements. (c) Show that there is no first-order 
sentence $\mathcal{B}$ that is true for any interpretation if and only if its domain is
finite. 


The second-order sentence (1) $\land\, \neg$(2) is true for a standard interpretation if
and only if the domain is denumerable. 


Exercises 


A.3 Show that there is no first-order sentence $\mathcal{B}$ the models of which are all
interpretations whose domain is denumerable. 
A.4 Construct a second-order formula Den($\mathbf{X}^1$) that is satisfied by an
assignment in a standard interpretation if and only if the set assigned
to $\mathbf{X}^1$ is denumerable.


Second-Order Theories 


We define a second-order theory in a language L2C by adding the following 
new logical axioms and rules to the first-order axioms and rules: 


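(B4a) $(\forall \mathbf{R}^n)\mathcal{B}(\mathbf{R}^n) \Rightarrow \mathcal{B}(\mathbf{W}^n)$, where $\mathcal{B}(\mathbf{W}^n)$ arises from $\mathcal{B}(\mathbf{R}^n)$ by replacing
all free occurrences of $\mathbf{R}^n$ by $\mathbf{W}^n$ and $\mathbf{W}^n$ is free for $\mathbf{R}^n$ in $\mathcal{B}(\mathbf{R}^n)$.
(B4b) $(\forall \mathbf{g}^n)\mathcal{B}(\mathbf{g}^n) \Rightarrow \mathcal{B}(\mathbf{h}^n)$, where $\mathcal{B}(\mathbf{h}^n)$ arises from $\mathcal{B}(\mathbf{g}^n)$ by replacing all
free occurrences of $\mathbf{g}^n$ by $\mathbf{h}^n$ and $\mathbf{h}^n$ is free for $\mathbf{g}^n$ in $\mathcal{B}(\mathbf{g}^n)$.
(B5a) $(\forall \mathbf{R}^n)(\mathcal{B} \Rightarrow \mathcal{C}) \Rightarrow (\mathcal{B} \Rightarrow (\forall \mathbf{R}^n)\mathcal{C})$, where $\mathbf{R}^n$ is not free in $\mathcal{B}$.
(B5b) $(\forall \mathbf{g}^n)(\mathcal{B} \Rightarrow \mathcal{C}) \Rightarrow (\mathcal{B} \Rightarrow (\forall \mathbf{g}^n)\mathcal{C})$, where $\mathbf{g}^n$ is not free in $\mathcal{B}$.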


Comprehension Schema (COMP) 


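$(\exists \mathbf{R}^n)(\forall \langle x \rangle_n)(\mathbf{R}^n(\langle x \rangle_n) \Leftrightarrow \mathcal{B})$, provided that all free variables of $\mathcal{B}$ occur in $\langle x \rangle_n$
and $\mathbf{R}^n$ is not free in $\mathcal{B}$.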


Function Definition Schema (FUNDEF) 


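$(\forall \mathbf{R}^{n+1})[(\forall \langle x \rangle_n)(\exists_1 y)\mathbf{R}^{n+1}(\langle x \rangle_n, y) \Rightarrow (\exists \mathbf{g}^n)(\forall \langle x \rangle_n)\mathbf{R}^{n+1}(\langle x \rangle_n, \mathbf{g}^n(\langle x \rangle_n))]$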


New Rules 


(Gen2a) $(\forall \mathbf{R}^n)\mathcal{B}$ follows from $\mathcal{B}$
(Gen2b) $(\forall \mathbf{g}^n)\mathcal{B}$ follows from $\mathcal{B}$


Exercises 


A.5 Show that we can prove analogues of the usual equality axioms 
(A6)-(A7) in any second-order theory: 
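i. $\vdash t = t \land \mathbf{g}^n = \mathbf{g}^n \land \mathbf{R}^n = \mathbf{R}^n$
ii. $\vdash t = s \Rightarrow (\mathcal{B}(t, t) \Rightarrow \mathcal{B}(t, s))$, where $\mathcal{B}(t, s)$ arises from $\mathcal{B}(t, t)$ by
replacing zero or more occurrences of t by s, provided that s is free for t in
$\mathcal{B}(t, t)$.
iii. $\vdash \mathbf{g}^n = \mathbf{h}^n \Rightarrow (\mathcal{B}(\mathbf{g}^n, \mathbf{g}^n) \Rightarrow \mathcal{B}(\mathbf{g}^n, \mathbf{h}^n))$, where $\mathcal{B}(\mathbf{g}^n, \mathbf{h}^n)$ arises from
$\mathcal{B}(\mathbf{g}^n, \mathbf{g}^n)$ by replacing zero or more occurrences of $\mathbf{g}^n$ by $\mathbf{h}^n$, provided that
$\mathbf{h}^n$ is free for $\mathbf{g}^n$ in $\mathcal{B}(\mathbf{g}^n, \mathbf{g}^n)$.
iv. $\vdash \mathbf{R}^n = \mathbf{S}^n \Rightarrow (\mathcal{B}(\mathbf{R}^n, \mathbf{R}^n) \Rightarrow \mathcal{B}(\mathbf{R}^n, \mathbf{S}^n))$, where $\mathcal{B}(\mathbf{R}^n, \mathbf{S}^n)$ arises from
$\mathcal{B}(\mathbf{R}^n, \mathbf{R}^n)$ by replacing zero or more occurrences of $\mathbf{R}^n$ by $\mathbf{S}^n$, provided
that $\mathbf{S}^n$ is free for $\mathbf{R}^n$ in $\mathcal{B}(\mathbf{R}^n, \mathbf{R}^n)$.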
A.6 Formulate and prove a second-order analogue of the first-order
deduction theorem (Proposition 2.5).


Let PC2 denote the second-order theory in the language L2C without any 
nonlogical axioms. PC2 is called a second-order predicate calculus. 


Proposition A.1 (Soundness) 
Every theorem of PC2 is standardly valid. 


Proof 


That all the logical axioms (except Comp and FunDef) are standardly 
valid and that the rules of inference preserve standard validity follow by 
arguments like those for the analogous first-order properties. The standard 
validity of Comp and FunDef follows by simple set-theoretic arguments. 

We shall see that the converse of Proposition A.1 does not hold. This will 
turn out to be not a consequence of a poor choice of axioms and rules but an 
inherent incompleteness of second-order logic. 

Let us consider the system of natural numbers. No first-order theory will 
have as its models those and only those interpretations that are isomorphic 
to the system of natural numbers.* However, a second-order characterization 
of the natural numbers is possible. Let AR2 be the conjunction of the axioms 
(S1)-(S8) of the theory S of formal arithmetic (see page 154), and the following 
second-order principle of mathematical induction: 


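(2S9) $(\forall \mathbf{R}^1)[\mathbf{R}^1(0) \land (\forall x)(\mathbf{R}^1(x) \Rightarrow \mathbf{R}^1(x')) \Rightarrow (\forall x)\mathbf{R}^1(x)]$


Notice that, with the help of Comp, all instances of the first-order axiom
schema (S9) can be derived from (2S9).†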

For any standard interpretation that is a model of AR2 we can prove the 
following result that justifies inductive definition. 


* Let K be any first-order theory in the language of arithmetic whose axioms are true in the
system of natural numbers. Add a new individual constant b and the axioms $b \neq \bar{n}$ for every
natural number n. The new theory K′ is consistent, since any finite set of its axioms has a
model in the system of natural numbers. By Proposition 2.17, K′ has a model, but that model
cannot be isomorphic to the system of natural numbers, since the object denoted by b cannot
correspond to a natural number.

† In AR2, the function letters for addition and multiplication and the associated axioms
(S5)-(S8) can be omitted. The existence of operations satisfying (S5)-(S8) can then be proved. See
Mendelson (1973, Sections 2.3 and 2.5).
Proposition A.2 (Iteration Theorem)


Let $\mathcal{M}$ be a standard interpretation that is a model of AR2, and let D be the
domain of $\mathcal{M}$. Let c be an element of an arbitrary set W and let g be a
singulary operation of W. Then there is a unique function F from D into W such
that F(0) = c and $(\forall x)(x \in D \Rightarrow F(x') = g(F(x)))$.*


Proof 


Let $\mathcal{C}$ be the set of all subsets H of D × W such that $\langle 0, c \rangle \in H$ and $(\forall x)(\forall w)$
$(\langle x, w \rangle \in H \Rightarrow \langle x', g(w) \rangle \in H)$. Note that D × W $\in \mathcal{C}$. Let F be the intersection
of all sets H in $\mathcal{C}$. We leave it to the reader to prove the following assertions:


a. F $\in \mathcal{C}$.

b. F is a function from D into W. (Hint: Let B be the set of all x in D for
which there is a unique w in W such that $\langle x, w \rangle \in F$. By mathematical
induction, show that B = D.)

c. F(0) = c.

d. F(x′) = g(F(x)) for all x in D.


The uniqueness of F can be shown by a simple application of mathematical 
induction. 


Proposition A.3 (Categoricity of AR2) 


Any two standard interpretations $\mathcal{M}$ and $\mathcal{M}^*$ that are models of AR2 are
isomorphic.


Proof 


Let D and D* be the domains of $\mathcal{M}$ and $\mathcal{M}^*$, 0 and 0* the respective zero
elements, and f and f* the respective successor operations. By the iteration
theorem applied to $\mathcal{M}$ with W = D*, c = 0* and g = f*, we obtain a function F
from D into D* such that F(0) = 0* and F(f(x)) = f*(F(x)) for any x in D. An easy
application of mathematical induction in $\mathcal{M}^*$ shows that every element of D*
is in the range of F. To show that F is one-one, apply mathematical induction
in $\mathcal{M}$ to the set of all x in D such that $(\forall y)[(y \in D \land y \neq x) \Rightarrow F(x) \neq F(y)]$.†

Let $\mathcal{A}$ consist of the nonlogical constants of formal arithmetic (zero,
successor, addition, multiplication, equality). Let $\mathcal{N}$ be the standard

* In order to avoid cumbersome notation, “0” denotes the interpretation in $\mathcal{M}$ of the individual
constant “0,” and “x′” denotes the result of the application to the object x of the interpretation
of the successor function.

† Details of the proof may be found in Mendelson (1973, Section 2.7).
interpretation of L2$\mathcal{A}$, with the set of natural numbers as its domain and the
usual interpretations of the nonlogical constants.


Proposition A.4 


Let $\mathcal{B}$ be any formula of L2$\mathcal{A}$. Then $\mathcal{B}$ is true in $\mathcal{N}$ if and only if AR2 $\Rightarrow \mathcal{B}$ is
standardly valid.


Proof 


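Assume AR2 $\Rightarrow \mathcal{B}$ is standardly valid. So AR2 $\Rightarrow \mathcal{B}$ is true in $\mathcal{N}$. But AR2 is
true in $\mathcal{N}$. Hence, $\mathcal{B}$ is true in $\mathcal{N}$. Conversely, assume $\mathcal{B}$ is true in $\mathcal{N}$. We must
show that AR2 $\Rightarrow \mathcal{B}$ is standardly valid. Assume that AR2 is true in some
standard interpretation $\mathcal{M}$ of L2$\mathcal{A}$. By the categoricity of AR2, $\mathcal{M}$ is
isomorphic to $\mathcal{N}$. Therefore, since $\mathcal{B}$ is true in $\mathcal{N}$, $\mathcal{B}$ is true in $\mathcal{M}$. Thus, AR2 $\Rightarrow \mathcal{B}$ is
true in every standard interpretation of L2$\mathcal{A}$, that is, AR2 $\Rightarrow \mathcal{B}$ is standardly
valid.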


Proposition A.5 


a. The set SV of standardly valid formulas of L2$\mathcal{A}$ is not effectively
enumerable.

b. SV is not recursively enumerable, that is, the set of Gödel numbers of
formulas in SV is not recursively enumerable.


Proof 


a. Assume that SV is effectively enumerable. Then, by Proposition A.4,
we could effectively enumerate the set $\mathcal{T}r$ of all true formulas of
first-order arithmetic by running through SV, finding all formulas of the
form AR2 $\Rightarrow \mathcal{B}$, where $\mathcal{B}$ is a formula of first-order arithmetic, and
listing those formulas $\mathcal{B}$. Then the theory $\mathcal{T}r$ would be decidable,
since, for any closed formula $\mathcal{C}$, we could effectively enumerate $\mathcal{T}r$
until either $\mathcal{C}$ or its negation appears. By Church's thesis, $\mathcal{T}r$ would
be recursively decidable, contradicting Corollary 3.46 (since $\mathcal{T}r$ is a
consistent extension of RR).

b. This follows from part (a) by Church’s thesis. 
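
The step in part (a) from "effectively enumerable" to "decidable" uses only the fact that, for every closed formula, exactly one of the formula and its negation belongs to the enumerated set. The following sketch illustrates that step on a toy example; it is not an enumeration of arithmetic truths. The string encoding of sentences, the toy enumerator, and the names `decide`, `negate`, and `toy_enumeration` are all assumptions made for the illustration.

```python
from itertools import count


def negate(sentence):
    """Toy negation on strings: add or strip a leading '~'."""
    return sentence[1:] if sentence.startswith("~") else "~" + sentence


def decide(sentence, enumerate_theory):
    """Decide membership by waiting for the sentence or its negation to appear.

    This terminates only because the enumerated set is assumed to be complete
    (one of the two always shows up) and consistent (never both).
    """
    for listed in enumerate_theory():
        if listed == sentence:
            return True
        if listed == negate(sentence):
            return False


def toy_enumeration():
    """A stand-in enumerator: p0, ~p1, p2, ~p3, ... (even atoms true, odd false)."""
    for i in count():
        yield f"p{i}" if i % 2 == 0 else f"~p{i}"


print(decide("p4", toy_enumeration), decide("p3", toy_enumeration))
```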


The use of Church’s thesis in the proof could be avoided by a consistent use 
of recursion-theoretic language and results. The same technique as the one 
used in part (a), together with Tarski’s theorem (Corollary 3.44), would show 
the stronger result that the set (of Gödel numbers) of the formulas in SV is
not arithmetical. 
Corollary A.6


The set of all standardly valid formulas is not effectively (or recursively) 
enumerable. 


Proof 


An enumeration of all standardly valid formulas would yield an enumeration
of all standardly valid formulas of L2$\mathcal{A}$, since the set of formulas of
L2$\mathcal{A}$ is decidable (recursively decidable).


Corollary A.7 


There is no axiomatic formal system whose theorems are the standardly
valid formulas of L2$\mathcal{A}$.


Proof 


If there were such an axiom system, we could enumerate the standardly
valid formulas of L2$\mathcal{A}$, contradicting Proposition A.5.


Proposition A.8 (Incompleteness of Standard Semantics) 


There is no axiomatic formal system whose theorems are all standardly valid 
formulas. 


Proof 


If there were such an axiom system, we could enumerate the set of all stan- 
dardly valid formulas, contradicting Corollary A.6. 

Proposition A.8 sharply distinguishes second-order logic from first-order 
logic, since Gödel’s completeness theorem tells us that there is an axiomatic
formal system whose theorems are all logically valid first-order formulas. 
Here are some additional important properties enjoyed by first-order theo- 
ries that do not hold for second-order theories: 


I. Every consistent theory has a model. To see that this does not hold
for second-order logic (with “model” meaning “model in the sense
of the standard semantics”), add to the theory AR2 a new individual
constant b. Let $\mathcal{T}$ be the theory obtained by adding to AR2 the
set of axioms $b \neq \bar{n}$ for all natural numbers n. $\mathcal{T}$ is consistent. (Any
proof involves a finite number of the axioms $b \neq \bar{n}$. AR2 plus any
finite number of the axioms $b \neq \bar{n}$ has the standard interpretation $\mathcal{N}$ as
a model, with b interpreted as a suitable natural number. So every
step of the proof would be true in $\mathcal{N}$. Therefore, a contradiction
cannot be proved.) But $\mathcal{T}$ has no standard model. (If $\mathcal{M}$ were such a
model, AR2 would be true in $\mathcal{M}$. Hence, $\mathcal{M}$ would be isomorphic to
$\mathcal{N}$, and so the domain of $\mathcal{M}$ would consist of the objects denoted by
the numerals $\bar{n}$. But this contradicts the requirement that the domain
of $\mathcal{M}$ would have to have an object denoted by “b” that would satisfy
the axioms $b \neq \bar{n}$ for all natural numbers n.)

II. The compactness property: a set $\Gamma$ of formulas has a model if and
only if every finite subset of $\Gamma$ has a model. A counterexample is
furnished by the set of axioms of the theory $\mathcal{T}$ in (I) earlier.

III. The upward Skolem-Löwenheim theorem: every theory that has an
infinite model has models of every infinite cardinality. In second-order
logic this fails for the theory AR2. By Proposition A.3, all models
of AR2 must be denumerable.

IV. The downward Skolem-Löwenheim theorem: every model $\mathcal{M}$ of a
theory has a countable elementary submodel.* In second-order logic,
a counterexample is furnished by the second-order categorical theory
for the real number system.† Another argument can be given
by the following considerations. We can express by the following
second-order formula $\mathcal{B}(\mathbf{Y}^1, \mathbf{X}^1)$ the assertion that $\mathbf{Y}^1$ is equinumerous
with the power set of $\mathbf{X}^1$:


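$(\exists \mathbf{R}^2)[(\forall x_1)(\forall x_2)(\mathbf{Y}^1(x_1) \land \mathbf{Y}^1(x_2) \land (\forall y)(\mathbf{X}^1(y) \Rightarrow [\mathbf{R}^2(x_1, y) \Leftrightarrow \mathbf{R}^2(x_2, y)]) \Rightarrow x_1 = x_2)$
$\land\, (\forall \mathbf{W}^1)(\mathbf{W}^1 \subseteq \mathbf{X}^1 \Rightarrow (\exists x)(\mathbf{Y}^1(x) \land (\forall y)(\mathbf{W}^1(y) \Leftrightarrow \mathbf{R}^2(x, y))))]$


$\mathbf{R}^2$ correlates with each x in $\mathbf{Y}^1$ the set of all y in $\mathbf{X}^1$ such that $\mathbf{R}^2(x, y)$; the
first conjunct makes this correlation one-one, and the second makes it onto the
power set of $\mathbf{X}^1$. Now consider the following sentence Cont: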


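$(\exists \mathbf{X}^1)(\exists \mathbf{Y}^1)($Den$(\mathbf{X}^1) \land (\forall y)\mathbf{Y}^1(y) \land \mathcal{B}(\mathbf{Y}^1, \mathbf{X}^1))$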


Then Cont is true in a standard interpretation if and only if the domain of the 
interpretation has the power of the continuum, since the power set of a denu- 
merable set has the power of the continuum. See Shapiro (1991, Section 5.1.2) 


* For a definition of elementary submodel, see Section 2.13. 

† The axioms are those for an ordered field (see page 97) plus a second-order completeness
axiom. The latter can be taken to be the assertion that every nonempty subset that is bounded 
above has a least upper bound (or, equivalently, that no Dedekind cut is a gap). For a proof of 
categoricity, see Mendelson (1973, Section 5.4). 
and Garland (1974) for more information about the definability of cardinal
numbers in second-order logic. 


Exercises 


A.7 Show that a sentence of pure second-order logic is true in a standard
interpretation $\mathcal{M}$ if and only if it is true in any other standard
interpretation whose domain has the same cardinal number as that of $\mathcal{M}$.


A.8 a. Show that there is a formula Cont($\mathbf{X}^1$) of pure second-order logic
that is satisfied by an assignment in an interpretation if and only if
the set assigned to $\mathbf{X}^1$ has the power of the continuum.


b. Find a sentence CH of pure second-order logic that is standardly 
valid if and only if the continuum hypothesis is true.* 


Henkin Semantics for L2C 


In light of the fact that completeness, compactness, and the
Skolem-Löwenheim theorems do not hold in second-order logic, it is of some
interest that there is a modification of the semantics for second-order logic that
removes those drawbacks and restores a completeness property. The
fundamental ideas sketched later are due to Henkin (1950).

Start with a first-order interpretation with domain D. For each positive
integer n, choose a fixed collection $\mathcal{D}(n)$ of n-ary relations on D and a fixed
collection $\mathcal{F}(n)$ of n-ary operations on D. Instead of $\Sigma_2$, we now use the set
$\Sigma_2^H$ of assignments s in $\Sigma_2$ such that, for each predicate variable $\mathbf{R}^n$, $s(\mathbf{R}^n)$ is
in $\mathcal{D}(n)$ and, for each function variable $\mathbf{g}^n$, $s(\mathbf{g}^n)$ is in $\mathcal{F}(n)$. The definitions of
satisfaction and truth are the same as for standard semantics, except that $\Sigma_2$
is replaced by $\Sigma_2^H$. Such an interpretation will be called a Henkin
interpretation. Using a Henkin interpretation amounts to restricting the ranges of the
predicate and function variables. For example, the range of a predicate
variable $\mathbf{R}^1$ need not be the entire power set $\mathcal{P}(D)$ of the domain D. In order for
a Henkin interpretation $\mathcal{H}$ to serve as an adequate semantic framework, we
must require that all instances of the comprehension schema and the function
definition schema are true in $\mathcal{H}$. A Henkin interpretation for which this
condition is met will be called a general model. A formula that is true in all
general models will be said to be generally valid, and a formula that is
satisfied by some assignment in some general model will be said to be generally
satisfiable. We say that $\mathcal{B}$ generally implies $\mathcal{C}$ if $\mathcal{B} \Rightarrow \mathcal{C}$ is generally valid and that
$\mathcal{B}$ is generally equivalent to $\mathcal{C}$ if $\mathcal{B} \Leftrightarrow \mathcal{C}$ is generally valid.
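
To see why the comprehension requirement is not automatic, it may help to look at a very small Henkin interpretation in which the range of the monadic predicate variables is too thin. The sketch below is an illustration under assumptions of our own: a two-element domain, a predicate letter A interpreted as {0}, and the comprehension instance $(\exists \mathbf{R}^1)(\forall x)(\mathbf{R}^1(x) \Leftrightarrow A(x))$. The function name is hypothetical.

```python
def comprehension_instance_holds(domain, unary_range, A):
    """Evaluate (∃R)(∀x)(R(x) ⇔ A(x)), with R ranging over unary_range only."""
    return any(all((x in R) == (x in A) for x in domain) for R in unary_range)


D = frozenset({0, 1})
A = frozenset({0})  # interpretation of a predicate letter A

# Range used by the standard semantics: every subset of D.
full_range = [frozenset(), frozenset({0}), frozenset({1}), D]
# A thin Henkin range: only the empty set and D itself.
thin_range = [frozenset(), D]

print(comprehension_instance_holds(D, full_range, A))
print(comprehension_instance_holds(D, thin_range, A))
```

With the full range the instance is true, because the set {0} itself is available as a value of the predicate variable. With the thin range no available relation matches A, so this Henkin interpretation falsifies an instance of the comprehension schema and is therefore not a general model.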

A standard interpretation on a domain D determines a corresponding general
model in which $\mathcal{D}(n)$ is the set of all n-ary relations on D and $\mathcal{F}(n)$ is the
set of all n-ary operations on D. Such a general model is called a full general


* We take as the continuum hypothesis the assertion that every subset of the set of real num- 
bers is either finite or denumerable or is equinumerous with the set of all real numbers. 
model. Standard satisfaction and truth are equivalent to Henkin satisfaction
and truth for the corresponding full general model. Hence, the following 
statements are obvious. 


Proposition A.9 


a. Every generally valid formula is also standardly valid. 
b. Every standardly satisfiable formula is generally satisfiable. 


We also have the following strengthening of Proposition A.1. 


Proposition A.10 


Every theorem of PC2 is generally valid. 


Proof 


The general validity of (Comp) and (FunDef) follows from the definition of 
a general model. The proofs for the other logical axioms are similar to those 
in the first-order case, as is the verification that general validity is preserved 
by the rules of inference. 


Proposition A.11 (General Second-Order Completeness) 


The theorems of PC2 coincide with the generally valid formulas of L2C. 


Proof 


Let $\mathcal{B}$ be a generally valid formula of L2C. We must show that $\mathcal{B}$ is a theorem
of PC2. (It suffices to consider only closed formulas.) Assume, for the sake of
contradiction, that $\mathcal{B}$ is not a theorem of PC2. Then, by the deduction
theorem, the theory PC2+{$\neg\mathcal{B}$} is consistent. If we could prove that any consistent
extension of PC2 has a general model, then it would follow that PC2+{$\neg\mathcal{B}$}
has a general model, contradicting our hypothesis that $\mathcal{B}$ is generally valid.
Hence, it suffices to establish the following result.


Henkin’s Lemma 


Every consistent extension $\mathcal{T}$ of PC2 has a general model.


Proof 


The strategy is the same as in Henkin’s proof of the fact that every consistent
first-order theory has a model. One first adds enough new individual
constants, function letters, and predicate letters to provide “witnesses” for all
existential sentences. For example, for each sentence $(\exists x)\mathcal{C}(x)$, there will be a
new individual constant b such that $(\exists x)\mathcal{C}(x) \Rightarrow \mathcal{C}(b)$ can be consistently added
to the theory. (See Lemma 2.15 for the basic technique.) The same thing is
done for existential quantifiers $(\exists \mathbf{g}^n)$ and $(\exists \mathbf{R}^n)$. Let $\mathcal{T}^{*}$ be the consistent
extension of $\mathcal{T}$ obtained by adding all such conditionals as axioms. Then, by the
method of Lindenbaum’s lemma (Lemma 2.14), we inductively extend $\mathcal{T}^{*}$ to a
maximal consistent theory $\mathcal{T}^{\#}$. A general model $\mathcal{M}$ of $\mathcal{T}$ can be extracted from
$\mathcal{T}^{\#}$. The domain consists of the constant terms of $\mathcal{T}^{\#}$. The range of the
predicate variables consists of the relations determined by the predicate letters of
$\mathcal{T}^{\#}$. A predicate letter B determines the relation $B^{\#}$ such that $B^{\#}\langle t \rangle_n$ holds in
$\mathcal{M}$ if and only if $B\langle t \rangle_n$ is a theorem of $\mathcal{T}^{\#}$. The range of the function variables
consists of the operations determined by the function letters of $\mathcal{T}^{\#}$. If f is a
function letter of $\mathcal{T}^{\#}$, define an operation $f^{\#}$ by letting $f^{\#}(\langle t \rangle_n) = f(\langle t \rangle_n)$. A proof
by induction shows that, for every sentence $\mathcal{C}$, $\mathcal{C}$ is true in $\mathcal{M}$ if and only if $\mathcal{C}$ is
a theorem of $\mathcal{T}^{\#}$. In particular, all theorems of $\mathcal{T}$ are true in $\mathcal{M}$.

The compactness property and the Skolem-Löwenheim theorems also
hold for general models. See Manzano (1996, Chapter IV) or Shapiro (1991)
for detailed discussions.*


Corollary A.12 


There are standardly valid formulas that are not generally valid. 


Proof 


By Corollary A.7, there is no axiomatic formal system whose theorems are
the standardly valid formulas of L2$\mathcal{A}$. By Proposition A.11, the generally
valid formulas of L2$\mathcal{A}$ are the theorems of the second-order theory P$\mathcal{A}$2.
Hence, the set of standardly valid formulas of L2$\mathcal{A}$ is different from the set
of generally valid formulas of L2$\mathcal{A}$. Since all generally valid formulas are
standardly valid, there must be some standardly valid formula that is not
generally valid.

We can exhibit an explicit sentence that is standardly valid but not generally
valid. The Gödel-Rosser incompleteness theorem (Proposition 3.38) can
be proved for the second-order theory AR2. Let $\mathcal{R}$ be Rosser’s undecidable
sentence for AR2.† If AR2 is consistent, $\mathcal{R}$ is true in the standard model of
arithmetic. (Recall that $\mathcal{R}$ asserts that, for any proof in AR2 of $\mathcal{R}$, there is a
proof in AR2, with a smaller Gödel number, of $\neg\mathcal{R}$. If AR2 is consistent, $\mathcal{R}$
is undecidable in AR2 and, therefore, there is no proof in AR2 of $\mathcal{R}$, which

* Lindstrém (1969) has shown that, in a certain very precise sense, first-order logic is the stron- 
gest logic that satisfies the countable compactness and Skolem-—Léwenheim theorems. So 
general models really are disguised first-order models. 

* We must assume that AR is consistent. 
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makes 7 trivially true.) Hence, AR2 >is standardly valid, by Proposition 
AA. However, AR2 >.vis not generally valid. For, if AR2 >. were generally 
valid, it would be provable in Pv 2, by Proposition A.11. Hence, .7 would be 
provable in AR2, contradicting the fact that it is an undecidable sentence of 
AR2. 


Exercise 


A.9 a. Show that the second-order theory AR2 is recursively undecidable. 


b. Show that the pure second-order predicate calculus PPS2 is
recursively undecidable.*


It appears that second-order and higher-order logics were the implicitly 
understood logics of mathematics until the 1920s. The axiomatic charac- 
terization of the natural numbers by Dedekind and Peano, the axiomatic 
characterization of the real numbers as a complete ordered field by Hilbert 
in 1900, and Hilbert’s axiomatization of Euclidean geometry in 1902 (in the 
French translation of his original 1899 book) all presupposed a second-order 
logic in order to obtain the desired categoricity. The distinction between 
first-order and second-order languages was made by Löwenheim (1915) and
by Hilbert in unpublished 1917 lectures and was crystal clear in Hilbert and
Ackermann's (1950),† where the problem was posed about the completeness
of their axiom system for first-order logic. The positive solution to this
problem presented in Gödel (1930), and the compactness and Skolem–Löwenheim
theorems that followed therefrom, probably made the use of first-order logic
more attractive. Another strong point favoring first-order logic was the fact
that Skolem in 1922 constructed a first-order system for axiomatic set theory
that overcame the imprecision in the Zermelo and Fraenkel systems.‡ Skolem
was always an advocate of first-order logic, perhaps because it yielded the
relativity of mathematical notions that Skolem believed in. Philosophical
support for first-order logic came from W.V. Quine, who championed the
position that logic is first-order logic and that second-order logic is just set
theory in disguise.


* The pure second-order monadic predicate logic MP2 (in which there are no nonlogical
constants and no function variables and all second-order predicate variables are monadic) is
recursively decidable. See Ackermann (1954) for a proof. The earliest proof was found by
Löwenheim (1915), and simpler proofs were given by Skolem (1919) and Behmann (1922).

† Hilbert and Ackermann (1950) is a translation of the second (1938) edition of a book which
was first published in 1928 as Grundzüge der theoretischen Logik.

‡ See Moore (1988) and Shapiro (1991) for more about the history of first-order logic. Shapiro
(1991) is a reliable and thorough study of the controversies involving first-order and
second-order logic.


The rich lodes of first-order model theory and proof theory kept logicians
busy and satisfied for over a half-century, but recent years have seen a revival
of interest in higher-order logic and other alternatives to first-order logic,
and the papers in the book Model-Theoretic Logics (edited by Barwise and
Feferman 1985) offer a picture of these new developments.* Barwise (1985)
lays down the challenge to the old first-order orthodoxy, and Shapiro (1991)
and Corcoran (1987, 1998) provide philosophical, historical, and technical
support for higher-order logic. Of course, we need not choose between
first-order and higher-order logic; there is plenty of room for both.


* Van Benthem and Doets (1983) also provide a high-level survey of second-order logic and its
ramifications.


Appendix B: First Steps in 
Modal Propositional Logic 


(#) It is necessary that 1+1=2. 


(##) It is possible that George Washington never met King George III of 
England. 


These assertions are examples of the application of the modal operators “nec- 
essary” and “possible.” We understand here that “necessity” and “possibil- 
ity” are used in the logical or mathematical sense.* There are other usages, 
such as “scientific necessity,” but, unless something is said to the contrary, 
we shall hold to the logical or mathematical sense.

In its classical usage, the notation □α stands for the assertion that α is
necessary. Here, "α" and other lowercase Greek letters will stand for propositions.
In traditional modal logic, ◇α was taken to assert that α is possible. As before,
as basic propositional connectives we choose negation ¬ and the conditional ⇒.
The other standard connectives, conjunction ∧, disjunction ∨, and the
biconditional ⇔, are introduced by definition in the usual way. The well-formed
formulas (wfs) of modal propositional logic are obtained from the propositional
letters A₁, A₂, A₃, ... by applying ¬, ⇒, and □ in the usual ways. Thus, each Aⱼ is
a wf, and if α and β are wfs, then (¬α), (α ⇒ β), and (□α) are wfs. The expression
(◇α) is defined as (¬(□(¬α))), since asserting that α is possible is intuitively
equivalent to asserting that the negation of α is not necessary.

We shall adopt the same conventions for omitting parentheses as in
Chapter 1, with the additional proviso that □ and ◇ are treated like ¬. For
example, (¬(□(¬A₁))) is abbreviated as ¬□¬A₁, and □A₁ ⇒ A₁ is an
abbreviation of ((□A₁) ⇒ A₁). Finally, to avoid writing too many subscripts, we
often will write A, B, C, D instead of A₁, A₂, A₃, A₄.
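
The formation rules and the parenthesis conventions can be mirrored in a few lines of code. The following is only a minimal sketch and not part of the formal development: wfs are modeled as nested Python tuples, and the tuple tags and the helper names neg, imp, box, dia, and show are assumptions made for this illustration. It shows that ◇ is merely an abbreviation and that an abbreviated wf stands for a fully parenthesized one.

```python
# Sketch only: modal wfs as nested tuples. The tuple tags and the helper
# names below are assumptions made for this illustration.

def neg(a):
    return ("not", a)

def imp(a, b):
    return ("imp", a, b)

def box(a):
    return ("box", a)

def dia(a):
    # (◇α) is defined as (¬(□(¬α))); it is not a primitive connective.
    return neg(box(neg(a)))

def show(wf):
    """Write out a wf with every pair of parentheses restored."""
    if isinstance(wf, str):          # a propositional letter such as "A1"
        return wf
    if wf[0] == "not":
        return "(¬" + show(wf[1]) + ")"
    if wf[0] == "box":
        return "(□" + show(wf[1]) + ")"
    if wf[0] == "imp":
        return "(" + show(wf[1]) + " ⇒ " + show(wf[2]) + ")"
    raise ValueError("unknown connective: " + repr(wf[0]))

possibly_a1 = show(dia("A1"))             # the wf abbreviated as ◇A1
t_instance = show(imp(box("A1"), "A1"))   # the wf abbreviated as □A1 ⇒ A1
```

Here possibly_a1 is the unabbreviated form of ◇A₁ and t_instance is the unabbreviated form of □A₁ ⇒ A₁.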

Our study of modal logic will begin with the study of various axiomatic 
theories. A theory is a set of formulas closed with respect to two rules: the 
traditional modus ponens rule 
(MP): β follows from α and α ⇒ β
and the necessitation rule
(N): □α follows from α


* We could also say that the sentence “1 + 1 = 2” is logically necessary and the sentence “George 
Washington never met King George III of England” is logically possible by virtue of the mean- 
ing of those sentences, but some people prefer to avoid use of the concept of “meaning.” If we 
should find out that George Washington actually did meet King George III, then the sentence 
(##) would still be true, but uninteresting. If we should find out that George Washington 
never did meet King George III, then, although (##) is true, it would be silly to assert it when 
we knew a stronger statement. In general, a sentence A logically implies that A is possible.

An axiomatic theory will be determined in the usual way by a set of axioms
and the rules of inference (MP) and (N). Thus, a proof in such a theory is a
finite sequence of wfs such that each wf in the sequence is either an axiom or
follows by (MP) or (N) from the preceding wfs in the sequence, and a theorem
of a theory is the last wf of any proof in the theory. We shall study several
modal theories that have historical and theoretical importance. For some of
those theories, the principal interpretation of □α will not be "α is necessary."
For example, □α may mean that α is provable (in a specified formal system).

Inclusion of the necessitation rule (N) may call for some explanation, since
we ordinarily would not want the necessity of α to follow from α. However,
the systems to be studied here initially will have as theorems logical or 
mathematical truths, and these are necessary truths. Moreover, in most of 
the systems that are now referred to as modal logics, rule (N) will be seen 
to be acceptable. Axiomatic systems for which this is not so will not include 
rule (N).* 

By K we shall denote the modal theory whose axioms are the following 
axiom schemas: 


(A1) All instances of tautologies
(A2) All wfs of the form □(α ⇒ β) ⇒ (□α ⇒ □β)


These are reasonable assumptions, given our interpretation of the necessity
operator □, and they also will be reasonable with the other interpretations
of □ that we shall study.

By a normal theory we shall mean any theory for which all instances of (A1)
and (A2) are theorems. Unless something is said to the contrary, in what
follows all our theories will be assumed to be normal. Note that any extension†
of a normal theory is a normal theory.


Exercises


B.1 a. If α is a tautology, then ⊢_K □α.
    b. If α ⇒ β is a tautology, then ⊢_K □α ⇒ □β.
    c. If α ⇒ β is a tautology and ⊢_K □α, then ⊢_K □β.
    d. If α ⇔ β is a tautology, then ⊢_K □α ⇔ □β. (Note that, if α ⇔ β is a
       tautology, then so are α ⇒ β and β ⇒ α.)
    e. If ⊢_K α ⇒ β, then ⊢_K □α ⇒ □β. (Use (N) and (A2).)
    f. If ⊢_K α ⇒ β, then ⊢_K ◇α ⇒ ◇β. (First get ⊢_K ¬β ⇒ ¬α and use (e).)


* When needed, the effect of rule (N) can be obtained for special cases by adding suitable
axioms.

† A theory V is defined to be an extension of a theory U if all theorems of U are theorems of V.
Moreover, an extension V of U is called a proper extension of U if V has a theorem that is not
a theorem of U.


B.2 a. ⊢_K □(α ∧ β) ⇒ □α and ⊢_K □(α ∧ β) ⇒ □β. (Use the tautologies
       α ∧ β ⇒ α and α ∧ β ⇒ β, and B.1(b).)
    b. ⊢_K □α ⇒ (□β ⇒ □(α ∧ β)). (Use the tautology α ⇒ (β ⇒ α ∧ β), and
       B.1(b).)
    c. ⊢_K □α ∧ □β ⇔ □(α ∧ β). (Use (a) and (b).)
    d. If ⊢_K α ⇔ β, then ⊢_K □(α ⇔ β).
    e. ⊢_K □(α ⇔ β) ⇒ (□α ⇔ □β). (Use (c) and (A2), recalling that γ ⇔ δ is
       defined as (γ ⇒ δ) ∧ (δ ⇒ γ).)
    f. If ⊢_K α ⇔ β, then ⊢_K □α ⇔ □β.


B.3 ⊢_K ◇(α ∨ β) ⇔ ◇α ∨ ◇β. (Note that ¬◇(α ∨ β) is provably equivalent* in
K to □¬(α ∨ β), which, by B.1(d), is, in turn, provably equivalent in K
to □(¬α ∧ ¬β) and the latter, by B.2(c), is provably equivalent in K to
□¬α ∧ □¬β. This wf is provably equivalent in K to ¬(¬□¬α ∨ ¬□¬β),
which is ¬(◇α ∨ ◇β). Putting together these equivalences, we see that
¬◇(α ∨ β) is provably equivalent in K to ¬(◇α ∨ ◇β) and, therefore, that
◇(α ∨ β) is provably equivalent in K to ◇α ∨ ◇β.)


B.4 a. ⊢_K □α ⇔ ¬◇¬α (note that □α is, by B.1(d), provably equivalent in
       K to □¬¬α, and the latter is provably equivalent in K to ¬¬□¬¬α,
       which is ¬◇¬α)
    b. ⊢_K ¬□α ⇔ ◇¬α
    c. ⊢_K ¬◇α ⇔ □¬α


B.5 a. ⊢_K □(α ⇔ β) ⇒ □(¬α ⇔ ¬β)
    b. ⊢_K □(α ⇔ β) ⇒ (□¬α ⇔ □¬β)
    c. ⊢_K □(α ⇔ β) ⇒ (◇α ⇔ ◇β)
    d. ⊢_K ¬□◇α ⇔ ◇□¬α
    e. ⊢_K □α ∨ □β ⇒ □(α ∨ β).
    f. ⊢_K ◇(α ∧ β) ⇒ ◇α ∧ ◇β.
    g. ⊢_K □α ⇒ (◇β ⇒ ◇(α ∧ β)).


Define α ⥽ β as □(α ⇒ β). This relation is called strict implication.


B.6 a. ⊢_K α ⥽ β ⇔ ¬◇(α ∧ ¬β)
    b. ⊢_K α ⥽ β ∧ β ⥽ γ ⇒ α ⥽ γ
    c. ⊢_K □α ⇒ (β ⥽ α) (this says that a necessary wf is strictly implied by
       every wf)
    d. ⊢_K □¬α ⇒ (α ⥽ β) (this says that an impossible wf strictly implies
       every wf)
    e. ⊢_K □α ⇔ ((¬α) ⥽ α)


* To say that γ is provably equivalent in K to δ amounts to saying that ⊢_K γ ⇔ δ.


Theorem B.1 Substitution of Equivalents


Let Γ_α be a modal wf containing the wf α and let Γ_β be obtained from Γ_α by
replacing one occurrence of α by β. If ⊢_K α ⇔ β, then ⊢_K Γ_α ⇔ Γ_β.


Proof


Induction on the number n of connectives (including □) in Γ_α. If n = 0, then Γ_α
is a statement letter A and α is A itself. So Γ_β is β, the hypothesis is ⊢_K A ⇔ β,
and the required conclusion ⊢_K Γ_α ⇔ Γ_β is the same as the hypothesis. For
the inductive step, assume that the theorem holds for all wfs having fewer
than n connectives.


Case 1. Γ_α has the form ¬Δ_α. Then Γ_β has the form ¬Δ_β, where Δ_β is obtained
from Δ_α by replacing one occurrence of α by β. By the inductive hypothesis,
⊢_K Δ_α ⇔ Δ_β. Hence, ⊢_K ¬Δ_α ⇔ ¬Δ_β, which is the desired conclusion.


Case 2. Γ_α has the form Δ_α ⇒ Λ_α. By the inductive hypothesis, ⊢_K Δ_α ⇔ Δ_β
and ⊢_K Λ_α ⇔ Λ_β, where Δ_β and Λ_β are obtained from Δ_α and Λ_α by replacing
zero or one occurrence of α by β. But from ⊢_K Δ_α ⇔ Δ_β and ⊢_K Λ_α ⇔ Λ_β one
can derive ⊢_K (Δ_α ⇒ Λ_α) ⇔ (Δ_β ⇒ Λ_β), which is the required conclusion
⊢_K Γ_α ⇔ Γ_β.


Case 3. Γ_α has the form □Δ_α. By the inductive hypothesis, ⊢_K Δ_α ⇔ Δ_β, where
Δ_β is obtained from Δ_α by replacing one occurrence of α by β. By B.2(f),
⊢_K □Δ_α ⇔ □Δ_β, that is, ⊢_K Γ_α ⇔ Γ_β.


Note that, by iteration, this theorem can be extended to cases where more
than one occurrence of α is replaced by β.


Theorem B.2 General Substitution of Equivalents 


The result of Theorem B.1 holds in any extension of K (i.e., in any modal
theory containing (A1) and (A2)). The same proof works as that for Theorem B.1.


Exercise 


B.7 a |-.O00 (AV B) 3 o00 GA>=>B) 
b. |-x (Qa > > of) @ >> (Ge > O78) 


Notice that, if □α is interpreted as "α is necessary," acceptance of the
necessitation rule limits us to theories in which all axioms are necessary. For, if α
were an axiom that is not a necessary proposition, then □α would be a
theorem, contrary to the intended interpretation. Moreover, since it is obvious that
(MP) and (N) lead from necessary propositions to necessary propositions, all
theorems would be necessary. So, if we want an applied modal logic with
some theorems that are not necessary, as we would, say, in physics, then we
would be impelled to give up rule (N). To still have the theorem on
substitution of equivalents, we would have to add as axioms certain necessary wfs
that would enable us to obtain the result of Exercise B.2(f), namely, if
⊢_K α ⇔ β, then ⊢_K □α ⇔ □β. However, to avoid the resulting complexity, we
shall keep (N) as a rule of inference in our treatment of modal logic.

Let us extend the normal theory K to a stronger theory T by adding to K 
the axiom schema: 

(A3) All wfs of the form □α ⇒ α.

This schema is sometimes called the necessity schema. It asserts that every
necessary proposition is true. We shall show later that T is stronger than K,
that is, that there are wfs □α ⇒ α that are not theorems of K. Note that, in
T, strict implication α ⥽ β implies the ordinary conditional α ⇒ β (which is
sometimes called a material implication).


Exercise 


B.8 a. ⊢_T α ⇒ ◇α (use the instance □¬α ⇒ ¬α of (A3) and an instance of
       the tautology (γ ⇒ ¬δ) ⇒ (δ ⇒ ¬γ))
    b. ⊢_T □□α ⇒ □α (replace α by □α in (A3))
    c. ⊢_T □...□α ⇒ □α (for any positive number of □'s in the antecedent)
    d. ⊢_T α ⇒ ◇...◇α


Now we turn to an extension of the system T obtained by adding the axiom 
schema: 

(A4) All wfs of the form □α ⇒ □□α

This enlarged system is traditionally designated as S4. This notation is
taken from the work Symbolic Logic (New York, 1932) by C.I. Lewis and C.H.
Langford, one of the earliest treatments of modal logic from the standpoint
of modern formal logic.

The justification of (A4) is not as straightforward as that for the previous
axioms. If □α is true, then the necessity of α is due to the form of α, that is,
to the logical structure of the proposition asserted by α. Since that structure
is not an empirical fact, that is, it does not depend on the way our world is
constituted, the truth of □α is necessary. Hence, □□α follows.


Exercises 


B.9 a. ⊢_S4 □□α ⇒ □□□α (and so on, for any number of □'s)
    b. ⊢_S4 □α ⇒ □...□α (and so on, for any number of □'s)
    c. ⊢_S4 □α ⇔ □□α (use (A4) and B.8(b))
    d. ⊢_S4 □α ⇔ □...□α (use (b) and B.8(c))

B.10 a. ⊢_S4 □α ⇔ ¬¬□α
     b. ⊢_S4 □□α ⇔ □¬¬□α (use B.2(e))
     c. ⊢_S4 □α ⇔ □¬¬□α (use (b) and B.9(c))
     d. ⊢_S4 □¬α ⇔ □¬¬□¬α (replace α by ¬α in (c))
     e. ⊢_S4 ¬□¬α ⇔ ¬□¬¬□¬α
     f. ⊢_S4 ◇α ⇔ ◇◇α (abbreviation of (e))
     g. ⊢_S4 ◇α ⇔ ◇...◇α

By B.9(d) and substitution of equivalents, any sequence of consecutive □'s
within a modal wf γ can be replaced by a single occurrence of □ to yield a wf
that is provably equivalent to γ in S4 or any extension of S4. By B.10(g), the
same holds for ◇ instead of □.


Exercise 


B.11 (A Sharper Substitution of Equivalents) Let Γ_α be a modal wf containing
the wf α and let Γ_β be obtained from Γ_α by replacing one occurrence
of α by β. Then ⊢_S4 □(α ⇔ β) ⇒ (Γ_α ⇔ Γ_β). (Hint: The proof, like that of
Theorem B.1, is by induction, except that the induction wf, instead of
being

"If ⊢_K α ⇔ β, then ⊢_K Γ_α ⇔ Γ_β," is now "⊢_S4 □(α ⇔ β) ⇒ (Γ_α ⇔ Γ_β)."

When n = 0, we now need (A3). For Case 3 of the induction step, we
must use B.1(e), (A4), and B.2(e).) Note that we can extend B.11 to the case
where two or more occurrences of α are replaced by β.


Exercises 


B.12 a. ⊢_S4 ◇□◇α ⇒ ◇α (use (A3) to get ⊢_S4 □◇α ⇒ ◇α and then B.1(f) to
        get ⊢_S4 ◇□◇α ⇒ ◇◇α and then B.10(f))
     b. ⊢_S4 □◇α ⇒ □◇□◇α (use B.8(a) to get ⊢_S4 □◇α ⇒ ◇□◇α and then
        B.1(e) to get ⊢_S4 □□◇α ⇒ □◇□◇α; finally, apply B.9(c))
     c. ⊢_S4 □◇α ⇔ □◇□◇α (apply B.1(e) to B.12(a), and use B.12(b))
     d. ⊢_S4 ◇□α ⇔ ◇□◇□α (use (c) and negations)


B.13 Consecutive occurrences of □'s and/or ◇'s can be reduced in S4 to
either □ or ◇ or □◇ or ◇□ or □◇□ or ◇□◇.


We have seen (in B.9(d) and B.10(g)) that the axiom (A4) entails that
consecutive occurrences of the same modal operator are reducible in S4 to
a single occurrence of that operator. Now we shall introduce a similar
simplification when there are consecutive occurrences of different modal
operators.

Let S5 be the theory obtained by adding to T the following schema:


(A5) ◇α ⇒ □◇α


This amounts to saying that a proposition that is possible is necessarily possible.
Now we aim to show that (A4) is provable in S5.


Exercises 


B.14 a. ⊢_S5 ◇¬α ⇒ □◇¬α (use (A5) with α replaced by ¬α)
     b. ⊢_S5 ¬□◇¬α ⇒ ¬◇¬α (use contrapositive of (a))
     c. ⊢_S5 ◇¬◇¬α ⇔ ¬□◇¬α (use B.4(b))
     d. ⊢_S5 ◇¬◇¬α ⇒ ¬◇¬α (use (b) and (c))
     e. ⊢_S5 ◇□α ⇒ □α (use (d), B.4(a), and Theorem B.2)
     f. ⊢_S5 □α ⇒ ◇□α (use B.8(a), with α replaced by □α)
     g. ⊢_S5 ◇□α ⇔ □α (use (e) and (f))


B.15 a. ⊢_S5 □◇α ⇔ ◇α (use (A5), and (A3) with α replaced by ◇α)
     b. ⊢_S5 □◇□α ⇔ ◇□α (replace α by □α in (a))
     c. ⊢_S5 □α ⇒ ◇□α (use B.8(a))
     d. ⊢_S5 □α ⇒ □◇□α (use (b) and (c))
     e. ⊢_S5 □α ⇒ □□α (apply Theorem B.1 to (d) and B.14(g))


Note that B.15(e) is (A4). Since (A4) is a theorem of S5, it follows that S5 is an
extension of S4. We shall prove later that (A5) is not provable in S4, so that S5
is a proper extension of S4.


Exercise 


B.16 If ◇□α ⇒ □α (which, by B.14(e), is a theorem of S5) is added as an axiom
schema to T, show that schema (A5) becomes derivable. (Hence, ◇□α ⇒
□α could be used as an axiom schema for S5 instead of (A5).)


Notice that the two theorem schemas ◇□α ⇔ □α and □◇α ⇔ ◇α of S5
enable us (by substitution of equivalents) to reduce any sequence of modal
operators in a wf to the last modal operator of the sequence.


Exercise 


B.17 Find a modal wf that is provably equivalent in S5 to the wf 


and contains no sequence of consecutive modal operators. 

Note that the justification of (A5) is similar to the justification of (A4). If ◇α
is true, then the fact that α is possible is due to the form of α, that is, to the
logical structure of the proposition asserted by α. Since that structure is not
an empirical fact, that is, it does not depend on the way our world is
constituted, the truth of ◇α is necessary. Hence, □◇α follows.


Semantics for Modal Logic 


Recall that the basic wfs in propositional modal logic consist of the denumerable
sequence A₁, A₂, A₃, ... of propositional letters. By a world we shall mean
a function that assigns to each of the letters A₁, A₂, A₃, ... the value t (true)
or the value f (false). By a Kripke frame* we shall mean a nonempty set W of
worlds together with a binary relation R on W. If wRw* for worlds w and 
w*, then we shall say that w is R-related to w*. (Note that we are making no 
assumptions about the relation R. It may or may not be reflexive and it may 
or may not have any other special property of binary relations.) 

Assume now that the pair (W, R) is a Kripke frame. (W, R) determines a
truth value t or f for each modal wf α in each world w of W according to the
following inductive definition, where the induction takes place with respect
to the number of connectives (including □) in the wf α. If there are no
connectives in α, then α is a letter Aⱼ and the truth value of α in w is taken to be
the value assigned to Aⱼ by the world w. For the inductive step, assume that
the wf α has a positive number n of connectives and that the truth value
of any wf β having fewer than n connectives is already determined in all
worlds of W. If α is a negation ¬β or a conditional β ⇒ γ, then the number of
connectives in β and the number of connectives in γ are smaller than n and,
therefore, the truth values of β and γ in every world w in W are already
determined. The usual truth tables for ¬ and ⇒ then determine the truth value of
¬β and β ⇒ γ in w. Finally, assume that α has the form □β. Then the number
of connectives in β is smaller than n and, therefore, the truth value of β in
every world w in W is already determined. Now define the truth value of □β
in a world w in W to be t if and only if the truth value of β in every world w*
to which w is R-related is t. In other words, □β is true in w if and only if, for
every world w* in W such that wRw*, β is true in w*.

A wf α is said to be valid in a Kripke frame (W, R) if it is true in every world
w in W. α is said to be universally valid if it is valid in every Kripke frame.
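
The inductive definition of truth in a world translates directly into a recursive evaluator. What follows is only a minimal sketch under stated assumptions: wfs are nested tuples with the tags "not", "imp", and "box"; worlds are named by labels, and val[w] is the set of letters that world w makes true (a finite stand-in for a function defined on all the letters); the names holds and valid_in_frame are hypothetical and chosen for this illustration.

```python
# Sketch only. A frame is given by a list W of world labels and a set R of
# ordered pairs of labels; val[w] is the set of letters true in world w.

def holds(wf, w, R, val):
    """Truth value of wf in world w, by recursion on its connectives."""
    if isinstance(wf, str):                     # a propositional letter
        return wf in val[w]
    if wf[0] == "not":
        return not holds(wf[1], w, R, val)
    if wf[0] == "imp":
        return (not holds(wf[1], w, R, val)) or holds(wf[2], w, R, val)
    if wf[0] == "box":                          # true in every R-successor of w
        return all(holds(wf[1], v, R, val) for (u, v) in R if u == w)
    raise ValueError("unknown connective: " + repr(wf[0]))

def valid_in_frame(wf, W, R, val):
    """A wf is valid in the frame if it is true in every world of W."""
    return all(holds(wf, w, R, val) for w in W)

# Illustrative frame: u is R-related to v only; v is R-related to nothing.
W = ["u", "v"]
R = {("u", "v")}
val = {"u": {"A1"}, "v": {"A1", "A2"}}

a2_at_u = holds("A2", "u", R, val)               # A2 is false in u
box_a2_at_u = holds(("box", "A2"), "u", R, val)  # A2 is true in the only successor
box_a2_at_v = holds(("box", "A2"), "v", R, val)  # vacuously true: no successors

# An instance of schema (A2): □(A1 ⇒ A2) ⇒ (□A1 ⇒ □A2)
a2_instance = ("imp", ("box", ("imp", "A1", "A2")),
               ("imp", ("box", "A1"), ("box", "A2")))
a2_instance_valid = valid_in_frame(a2_instance, W, R, val)
```

Note that a wf of the form □β is vacuously true in a world that is R-related to no world, which is one reason why no assumption about R is built into the definition.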


Exercises 


B.18 For any world w in a Kripke frame (W, R), ◇β is true in w if and only if
there is a world w* in W such that wRw* and β is true in w*.
(Hint: Since ◇β is ¬□¬β, ◇β is true in w if and only if □¬β is not true
in w. Moreover, □¬β is not true in w if and only if there exists w* in
W such that wRw* and ¬¬β is true in w*. But ¬¬β is true in w* if and
only if β is true in w*.)


* In honor of Saul Kripke, who is responsible for a substantial part of the development of
modern modal logic.


B.19 If α and α ⇒ β are true in a world w in a Kripke frame, then so is β.
B.20 Every modal wf that is an instance of a tautology is universally valid.
B.21 If α and α ⇒ β are universally valid, so is β.


B.22 If α is universally valid, so is □α. (Hint: If □α is not universally valid,
then it is not valid in some Kripke frame (W, R). Hence, □α is not true in some
world w in W. Therefore, α is not true in some world w* in W such that
wRw*. Thus, α is not valid in (W, R), contradicting the universal validity
of α.)


B.23 □(α ⇒ β) ⇒ (□α ⇒ □β) is universally valid. (Hint: Assume the given wf
is not valid in some Kripke frame (W, R). So it is false in some world w
in W. Hence, □(α ⇒ β) is true in w, and □α ⇒ □β is false in w. Therefore,
□α is true in w and □β is false in w. Since □β is false in w, β is false in
some world w* such that wRw*. Since □(α ⇒ β) and □α are true in w,
and wRw*, it follows that α ⇒ β and α are true in w*. Hence, β is true in
w*, contradicting the fact that β is false in w*.)


B.24 a. The set of universally valid wfs forms a theory. (Use B.21 and B.22.)
b. The set of valid wfs in a Kripke frame forms a theory.
c. If all the axioms of a theory are valid in a Kripke frame (W, R), then
all theorems of the theory are valid in (W, R).
B.25 All theorems of K are universally valid. (Use B.20-B.23.)


Theorem B.3 □A₁ ⇒ A₁ Is Not Universally Valid


Proof


Let w₁ be a world that assigns f to every propositional letter, and let w₂ be
a world that assigns t to every propositional letter. Let W = {w₁, w₂}, and let
R be a binary relation on W that holds only for the pair (w₁, w₂). (W, R) is a
Kripke frame and □A₁ is true in the world w₁ of that Kripke frame, since A₁
is true in every world w* of W such that w₁Rw*. (In fact, w₂ is the only such
world in W, and A₁ is true in w₂.) On the other hand, A₁ is false in w₁. Thus,
□A₁ ⇒ A₁ is false in w₁. So □A₁ ⇒ A₁ is not valid in the Kripke frame (W, R)
and, therefore, □A₁ ⇒ A₁ is not universally valid.
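
The countermodel in this proof is small enough to check mechanically. The following minimal sketch assumes a tuple encoding of wfs (tags "not", "imp", "box") and a hypothetical evaluator named holds. Since only the letter A₁ occurs in the wf, each world is recorded simply by the set of relevant letters it makes true.

```python
# Sketch only: the two-world frame from the proof of Theorem B.3.

def holds(wf, w, R, val):
    """Truth value of wf in world w of the frame with relation R."""
    if isinstance(wf, str):
        return wf in val[w]
    if wf[0] == "not":
        return not holds(wf[1], w, R, val)
    if wf[0] == "imp":
        return (not holds(wf[1], w, R, val)) or holds(wf[2], w, R, val)
    if wf[0] == "box":
        return all(holds(wf[1], v, R, val) for (u, v) in R if u == w)
    raise ValueError("unknown connective: " + repr(wf[0]))

R = {("w1", "w2")}                   # R holds only for the pair (w1, w2)
val = {"w1": set(), "w2": {"A1"}}    # w1 makes A1 false, w2 makes A1 true

box_a1 = ("box", "A1")
schema_a3_instance = ("imp", box_a1, "A1")      # the wf □A1 ⇒ A1

box_a1_in_w1 = holds(box_a1, "w1", R, val)
a1_in_w1 = holds("A1", "w1", R, val)
instance_in_w1 = holds(schema_a3_instance, "w1", R, val)
```

The three computed values agree with the proof: □A₁ is true in w₁, A₁ is false in w₁, and so □A₁ ⇒ A₁ is false in w₁.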


Corollary B.4 □A₁ ⇒ A₁ Is Not a Theorem of the Theory K


Proof 


Theorem B.3 and Exercise B.25. 

Corollary B.4 shows that the theory T is a proper extension of K. 

Let us call a Kripke frame (W, R) reflexive if R is a reflexive relation
(i.e., wRw for all w in W).


Exercise


B.26 Every wf □α ⇒ α is valid in every reflexive Kripke frame (W, R).


(Hint: If □α is true in a world w in W, then α is true in every world w*
such that wRw*. But wRw and, therefore, α is true in w. Thus, □α ⇒ α is
true in every world in W.)


Exercise 


B.27 Every theorem of T is valid in every reflexive Kripke frame (W, R). 
(Use Exercise B.24(c) and B.26.) 


Let us call a Kripke frame (W, R) transitive if R is a transitive relation.* 


Exercise 


B.28 Every wf □α ⇒ □□α is valid in every transitive Kripke frame (W, R).


(Hint: Let w be a world in W, and assume □α true in w. Let us then show
that □□α is true in w. Assume wRw*, where w* is a world in W. We must
show that □α is then true in w*. Assume w*Rw#, where w# is in W. By
transitivity of R, wRw#. Since □α is true in w, α is true in w#. Hence, □α is true
in w*. So □□α is true in w. Thus, □α ⇒ □□α is true in w for every w in W.)


Exercise 


B.29 Every theorem of S4 is valid in every reflexive, transitive Kripke frame.


Theorem B.5 There Is a Reflexive Kripke Frame
in Which □A₁ ⇒ □□A₁ Is Not Valid


Proof


Let w₁ and w₂ be worlds in which A₁ is true and let w₃ be a world in which
A₁ is false. Let W = {w₁, w₂, w₃} and assume that R is a binary relation in W
for which w₁Rw₁, w₂Rw₂, w₃Rw₃, w₁Rw₂, and w₂Rw₃, but R holds for no other
pairs. (In particular, w₁Rw₃ is false.) Now, □A₁ is true in w₁, since, for any w*
in W, if w₁Rw* then A₁ is true in w*. On the other hand, □□A₁ is false in w₁.
To see this, first note that w₁Rw₂. Moreover, □A₁ is false in w₂, since w₂Rw₃
and A₁ is false in w₃.

Thus, □A₁ is true in w₁ and □□A₁ is false in w₁. Hence, □A₁ ⇒ □□A₁ is false
in w₁ and, therefore, not valid in (W, R).
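
This frame can also be verified by direct computation. The sketch below is illustrative only: it assumes a tuple encoding of wfs, names worlds by labels (so that w₁ and w₂ may agree on A₁ and still count as different worlds), and uses hypothetical helpers holds, is_reflexive, and is_transitive.

```python
# Sketch only: the three-world frame from the proof of Theorem B.5.

def holds(wf, w, R, val):
    """Truth value of wf in world w of the frame with relation R."""
    if isinstance(wf, str):
        return wf in val[w]
    if wf[0] == "not":
        return not holds(wf[1], w, R, val)
    if wf[0] == "imp":
        return (not holds(wf[1], w, R, val)) or holds(wf[2], w, R, val)
    if wf[0] == "box":
        return all(holds(wf[1], v, R, val) for (u, v) in R if u == w)
    raise ValueError("unknown connective: " + repr(wf[0]))

def is_reflexive(W, R):
    return all((w, w) in R for w in W)

def is_transitive(R):
    return all((a, d) in R for (a, b) in R for (c, d) in R if b == c)

W = ["w1", "w2", "w3"]
R = {("w1", "w1"), ("w2", "w2"), ("w3", "w3"), ("w1", "w2"), ("w2", "w3")}
val = {"w1": {"A1"}, "w2": {"A1"}, "w3": set()}   # A1 is false only in w3

box_a1 = ("box", "A1")
box_box_a1 = ("box", box_a1)

frame_is_reflexive = is_reflexive(W, R)
frame_is_transitive = is_transitive(R)            # fails: w1Rw2, w2Rw3, not w1Rw3
box_a1_in_w1 = holds(box_a1, "w1", R, val)
box_box_a1_in_w1 = holds(box_box_a1, "w1", R, val)
a4_instance_in_w1 = holds(("imp", box_a1, box_box_a1), "w1", R, val)
```

The frame is reflexive but not transitive, which is exactly what allows the instance □A₁ ⇒ □□A₁ of (A4) to fail in w₁.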


* R is said to be transitive if and only if whenever xRy and yRz, then xRz.


Corollary B.6 □A₁ ⇒ □□A₁ Is a Theorem of S4 That Is Not a Theorem of T


Proof 


Use Theorem B.5 and Exercise B.27. 


Corollary B.7 S4 Is a Proper Extension of T 


We want to show now that S5 is a proper extension of S4. Let us say that a 
Kripke frame (W, R) is symmetric if the relation R is symmetric in W, that is, 
for any w and w* in W, if wRw*, then w*Rw.


Exercise 


B.30 If a Kripke frame (W, R) is symmetric and transitive, then every instance
◇α ⇒ □◇α of (A5) is valid in (W, R). (Hint: Let w be a world in W.
Assume ◇α is true in w. We wish to show that □◇α is true in w. Since
◇α is ¬□¬α, □¬α is false in w. Hence, there is a world w* in W such
that wRw* and ¬α is false in w*. So α is true in w*. In order to prove
that □◇α is true in w, assume that w# is in W and wRw#. We must prove
that ◇α is true in w#. Since (W, R) is symmetric and wRw#, it follows
that w#Rw. Since wRw* and (W, R) is transitive, w#Rw*. But α is true in
w*. So, by B.18, ◇α is true in w#.)


Exercise 


B.31 All theorems of S5 are valid in every reflexive, symmetric, transitive
Kripke frame.


Exercise


B.32 ◇A₁ ⇒ □◇A₁ is not a theorem of S4. (Hint: Let W = {w₁, w₂}, where w₁
and w₂ are worlds such that A₁ is true in w₁ and false in w₂. Let R be the
binary relation on W such that w₁Rw₁, w₂Rw₂, and w₁Rw₂, but w₂Rw₁
is false. R is reflexive and transitive. ◇A₁ is true in w₁ and false in w₂.
□◇A₁ is false in w₁ because w₁Rw₂ and ◇A₁ is false in w₂. Hence, ◇A₁ ⇒
□◇A₁ is false in w₁ and, therefore, not valid in the reflexive, transitive
Kripke frame (W, R). Now use B.29.)
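
The frame described in this hint can be checked in the same mechanical way. This is a minimal sketch, assuming a tuple encoding of wfs and hypothetical helper names (holds, dia, is_reflexive, is_transitive); ◇ is expanded by its definition as ¬□¬.

```python
# Sketch only: the two-world reflexive, transitive frame from the hint to B.32.

def holds(wf, w, R, val):
    """Truth value of wf in world w of the frame with relation R."""
    if isinstance(wf, str):
        return wf in val[w]
    if wf[0] == "not":
        return not holds(wf[1], w, R, val)
    if wf[0] == "imp":
        return (not holds(wf[1], w, R, val)) or holds(wf[2], w, R, val)
    if wf[0] == "box":
        return all(holds(wf[1], v, R, val) for (u, v) in R if u == w)
    raise ValueError("unknown connective: " + repr(wf[0]))

def dia(wf):
    return ("not", ("box", ("not", wf)))          # ◇α abbreviates ¬□¬α

def is_reflexive(W, R):
    return all((w, w) in R for w in W)

def is_transitive(R):
    return all((a, d) in R for (a, b) in R for (c, d) in R if b == c)

W = ["w1", "w2"]
R = {("w1", "w1"), ("w2", "w2"), ("w1", "w2")}    # w2Rw1 does not hold
val = {"w1": {"A1"}, "w2": set()}                 # A1 true in w1, false in w2

dia_a1 = dia("A1")
a5_instance = ("imp", dia_a1, ("box", dia_a1))    # ◇A1 ⇒ □◇A1

frame_ok = is_reflexive(W, R) and is_transitive(R)
dia_a1_in_w1 = holds(dia_a1, "w1", R, val)
dia_a1_in_w2 = holds(dia_a1, "w2", R, val)
a5_instance_in_w1 = holds(a5_instance, "w1", R, val)
```

The relation is not symmetric, since w₁Rw₂ holds while w₂Rw₁ does not, and that is the property on which the validity of (A5) in B.30 depends.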


Exercise 


B.33 S5 is a proper extension of S4.
B.34 S5 and all its subtheories (including K, T, and S4) are consistent. (Hint:
Transform every wf α into the wf α* obtained by deleting all □'s. Each
axiom of S5 is transformed into a tautology. Every application of MP
taking wfs β and β ⇒ γ into γ is transformed into an application of MP
taking β* and β* ⇒ γ* into γ*, and every application of N taking β into
□β simply takes β* into β*. Hence, the transform of every theorem is a
tautology. So it is impossible to prove a wf δ and ¬δ because in such a
case both δ* and ¬δ* would be tautologies.)
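
The transformation in this hint is easy to experiment with. The sketch below is illustrative only: it assumes a tuple encoding of wfs, and the names erase_boxes, letters, truth, and is_tautology are hypothetical. It deletes all □'s from sample instances of (A2), (A3), and (A5) and checks by truth tables that the results are tautologies.

```python
# Sketch only: the box-deleting transform from the hint to B.34.
from itertools import product

def erase_boxes(wf):
    """Return the wf obtained by deleting every □."""
    if isinstance(wf, str):
        return wf
    if wf[0] == "box":
        return erase_boxes(wf[1])
    if wf[0] == "not":
        return ("not", erase_boxes(wf[1]))
    if wf[0] == "imp":
        return ("imp", erase_boxes(wf[1]), erase_boxes(wf[2]))
    raise ValueError("unknown connective: " + repr(wf[0]))

def letters(wf):
    if isinstance(wf, str):
        return {wf}
    return set().union(*(letters(part) for part in wf[1:]))

def truth(wf, row):
    """Truth-table value of a box-free wf under the assignment row."""
    if isinstance(wf, str):
        return row[wf]
    if wf[0] == "not":
        return not truth(wf[1], row)
    if wf[0] == "imp":
        return (not truth(wf[1], row)) or truth(wf[2], row)
    raise ValueError("box-free wf expected")

def is_tautology(wf):
    names = sorted(letters(wf))
    rows = product([True, False], repeat=len(names))
    return all(truth(wf, dict(zip(names, values))) for values in rows)

def dia(wf):
    return ("not", ("box", ("not", wf)))          # ◇α abbreviates ¬□¬α

a2 = ("imp", ("box", ("imp", "A", "B")), ("imp", ("box", "A"), ("box", "B")))
a3 = ("imp", ("box", "A"), "A")
a5 = ("imp", dia("A"), ("box", dia("A")))

axiom_transforms_ok = all(is_tautology(erase_boxes(ax)) for ax in (a2, a3, a5))
box_a_transform_ok = is_tautology(erase_boxes(("box", "A")))
```

Since the transform of □A is the letter A, which is not a tautology, the same argument shows that □A is not a theorem of S5.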


Exercise 


B.35 The theory obtained by adding to the theory T the schema


(B) α ⇒ □◇α


will be denoted B and called Becker's theory.*


a. All instances of B are provable in S5, and therefore, S5 is an
extension of B. (Hint: Use B.8(a) and (A5).)


b. A₁ ⇒ □◇A₁ is not provable in S4 and, therefore, not provable in T.
(Hint: Use the Kripke frame in the Hint for B.32.)


c. B is a proper extension of T.


d. All theorems of B are valid in every reflexive, symmetric Kripke
frame (W, R). (Hint: Since (W, R) is reflexive, all theorems of T are
valid in (W, R). We must show that every wf α ⇒ □◇α is valid in
(W, R). Assume, for the sake of contradiction, that there is a world
w in W such that α is true and □◇α is false in w. Since □◇α is false
in w, there exists a world w* in W such that wRw* and ◇α is false in
w*. Since R is symmetric, w*Rw and, since α is true in w, ◇α is true
in w*, which yields a contradiction.)


e. □A₁ ⇒ □□A₁ is not a theorem of B. (Hint: Let W = {w₁, w₂, w₃}, where
w₁, w₂, w₃ are worlds such that A₁ is true in w₁ and w₂ and false
in w₃. Let R be a reflexive, binary relation on W such that w₁Rw₂,
w₂Rw₁, w₂Rw₃, w₃Rw₂ hold, but w₁Rw₃ and w₃Rw₁ do not hold. Then
□A₁ is true in w₁ and, since □A₁ is false in w₂, □□A₁ is false in w₁.
Hence, □A₁ ⇒ □□A₁ is false in w₁ and, therefore, is not valid in the
reflexive, symmetric Kripke frame (W, R). So, by (d), □A₁ ⇒ □□A₁ is
not a theorem of B.)


f. S4 is not an extension of B.


g. S5 is a proper extension of B. (Hint: Use (a) and (f).)


* B is usually called the Brouwerian system and (B) is called the Brouwerian axiom because of
a connection with L.E.J. Brouwer's intuitionism. However, the system was proposed by O.
Becker in Becker [1930].


Appendix C: A Consistency Proof 
for Formal Number Theory 


The first consistency proof for first-order number theory S was given by
Gentzen (1936, 1938). Since then, other proofs along similar lines have been
given by Ackermann (1940) and Schütte (1951). As can be expected from Gödel's
second theorem, all these proofs use methods that apparently are not
available in S. Our exposition will follow Schütte's proof (1951).

The consistency proof will apply to a system S∞ that is much stronger
than S. S∞ is to have the same individual constant 0 and the same function
letters +, ·, ′ as S, and the same predicate letter =. Thus, S and S∞ have the
same terms and, hence, the same atomic formulas (i.e., formulas s = t, where s
and t are terms). However, the primitive propositional connectives of S∞ will
be ∨ and ~, whereas S had ⊃ and ~ as its basic connectives. We define a wf of
S∞ to be an expression built up from the atomic formulas by a finite number
of applications of the connectives ∨ and ~ and of the quantifiers (xᵢ) (i = 1,
2, ...). We let 𝒜 ⊃ ℬ stand for (~𝒜) ∨ ℬ; then any wf of S is an abbreviation of
a wf of S∞.

A closed atomic wf s = ¢ (i.e.,an atomic wf containing no variables) is called 
correct, if, when we evaluate s and t according to the usual recursion equa- 
tions for + and -, the same value is obtained for s and t; if different values are 
obtained, s = t is said to be incorrect. Clearly, one can effectively determine 
whether a given closed atomic wf is correct or incorrect. 

As axioms of S∞ we take: (1) all correct closed atomic wfs and (2) negations
of all incorrect closed atomic wfs. Thus, for example, (0″) · (0″) + 0″ = (0‴) · (0″)
and 0′ + 0″ ≠ 0′ · 0″ are axioms of S∞.
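The effective test for correctness can be sketched in a few lines. The following is only an illustration under our own assumptions: the class names Zero, Succ, Add, Mul and the helpers value, numeral, and is_correct are hypothetical and do not occur in the text, and Python's built-in arithmetic stands in for the recursion equations.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Zero:
    """The individual constant 0."""


@dataclass(frozen=True)
class Succ:
    """The successor term t'."""
    arg: object


@dataclass(frozen=True)
class Add:
    """The term s + t."""
    left: object
    right: object


@dataclass(frozen=True)
class Mul:
    """The term s . t."""
    left: object
    right: object


def value(term):
    """Evaluate a closed term to a natural number."""
    if isinstance(term, Zero):
        return 0
    if isinstance(term, Succ):
        return value(term.arg) + 1
    if isinstance(term, Add):
        return value(term.left) + value(term.right)
    if isinstance(term, Mul):
        return value(term.left) * value(term.right)
    raise TypeError(f"not a closed term: {term!r}")


def numeral(n):
    """Build the closed term 0 followed by n successor strokes."""
    term = Zero()
    for _ in range(n):
        term = Succ(term)
    return term


def is_correct(s, t):
    """A closed atomic wf s = t is correct when both sides have the same value."""
    return value(s) == value(t)


# The two examples from the text: the first equation is correct (an axiom),
# the second is incorrect (so its negation is an axiom).
first = is_correct(Add(Mul(numeral(2), numeral(2)), numeral(2)),
                   Mul(numeral(3), numeral(2)))
second = is_correct(Add(numeral(1), numeral(2)), Mul(numeral(1), numeral(2)))
```

Here first evaluates both sides to 6, while second compares 3 with 2.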

S∞ has the following rules of inference:

I. Weak rules

   a. Exchange:

          𝒞 ∨ 𝒜 ∨ ℬ ∨ 𝒟
          ---------------
          𝒞 ∨ ℬ ∨ 𝒜 ∨ 𝒟

   b. Consolidation:

          𝒜 ∨ 𝒜 ∨ 𝒟
          -----------
          𝒜 ∨ 𝒟

II. Strong rules

   a. Dilution:

          𝒟
          ------      (where 𝒜 is any closed wf)
          𝒜 ∨ 𝒟

   b. De Morgan:

          ~𝒜 ∨ 𝒟     ~ℬ ∨ 𝒟
          -------------------
          ~(𝒜 ∨ ℬ) ∨ 𝒟

   c. Negation:

          𝒜 ∨ 𝒟
          ---------
          ~~𝒜 ∨ 𝒟

   d. Quantification:

          ~𝒜(t) ∨ 𝒟
          ----------------      (where t is a closed term)
          (~(x)𝒜(x)) ∨ 𝒟

   e. Infinite induction:

          𝒜(n̄) ∨ 𝒟 for all natural numbers n
          -----------------------------------
          ((x)𝒜(x)) ∨ 𝒟

III. Cut

          𝒞 ∨ 𝒜     ~𝒜 ∨ 𝒟
          ------------------
          𝒞 ∨ 𝒟

In all these rules, the wfs above the line are called premisses, and the wfs
below the line, conclusions. The wfs denoted by 𝒞 and 𝒟 are called the side wfs
of the rule; in every rule either or both side wfs may be absent, except that
𝒟 must occur in a dilution (II(a)), and at least one of 𝒞 and 𝒟 in a cut (III). For
example, the inference from 𝒜 and ~𝒜 ∨ 𝒟 to 𝒟 is a cut, and the inference from
~𝒜 and ~ℬ to ~(𝒜 ∨ ℬ) is an instance of De Morgan's rule, II(b). In any rule,
the wfs that are not side wfs are called the principal wfs; these are the wfs
denoted by 𝒜 and ℬ in the earlier presentation of the rules. The principal
wf 𝒜 of a cut is called the cut wf; the number of propositional connectives
and quantifiers in ~𝒜 is called the degree of the cut.
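The finitary rules can be illustrated by a small sketch. It rests on assumptions of our own: a wf is modelled as a tuple of disjuncts (association to the left), an atomic or unanalysed disjunct is a string, and ~𝒜 is the pair ("~", 𝒜). The function names are hypothetical, and the quantifier rules are omitted.

```python
def neg(wf):
    """Form the negation ~wf."""
    return ("~", wf)


def exchange(disjuncts, i):
    """Exchange: swap the adjacent disjuncts at positions i and i + 1."""
    parts = list(disjuncts)
    parts[i], parts[i + 1] = parts[i + 1], parts[i]
    return tuple(parts)


def consolidation(disjuncts):
    """Consolidation: from A v A v D infer A v D."""
    if len(disjuncts) < 2 or disjuncts[0] != disjuncts[1]:
        raise ValueError("the first two disjuncts must coincide")
    return disjuncts[1:]


def dilution(disjuncts, a):
    """Dilution: from D infer A v D; the side wf D must be present."""
    if not disjuncts:
        raise ValueError("the side wf must occur in a dilution")
    return (a,) + tuple(disjuncts)


def negation(disjuncts):
    """Negation: from A v D infer ~~A v D."""
    return (neg(neg(disjuncts[0])),) + tuple(disjuncts[1:])


def cut(left, right):
    """Cut: from C v A and ~A v D infer C v D."""
    *c, a = left
    not_a, *d = right
    if not_a != neg(a):
        raise ValueError("the right premiss must begin with the negated cut wf")
    if not c and not d:
        raise ValueError("at least one side wf must occur in a cut")
    return tuple(c) + tuple(d)


# The cut mentioned in the text: from A and ~A v D infer D (C is absent).
example_cut = cut(("A",), (neg("A"), "D"))

# Exchange followed by consolidation: B v A v B  =>  A v B v B is not needed;
# instead swap the last two disjuncts of B v A v B? No: start from A v B v A.
swapped = exchange(("A", "B", "A"), 1)      # A v A v B
merged = consolidation(swapped)             # A v B
```

The premisses are not checked for provability here; the functions only carry out the shape of each rule.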

We still must define the notion of a proof in S∞. Because of the rule of
infinite induction, this is much more complicated than the notion of proof in
S. A G-tree is defined to be a graph the points of which can be decomposed
into disjoint "levels" as follows: At level 0, there is a single point, called the
terminal point; each point at level i + 1 is connected by an edge to exactly one
point at level i; each point P at level i is connected by edges to either zero, one,
two, or denumerably many points at level i + 1 (these latter points at level
i + 1 are called the predecessors of P); each point at level i is connected only to
points at level i − 1 or i + 1; a point at level i not connected to any points at
level i + 1 is called an initial point.

Examples of G-trees.

[Figure: three G-trees, each drawn level by level with its terminal point E at level 0.]

1. A, B, C, D are the initial points. E is the terminal point.
2. A, B, C₁, C₂, C₃, ... are the initial points. E is the terminal point.
3. A is the only initial point. E is the terminal point.

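For a finite G-tree, the levels and the initial points can be computed directly from the definition. The sketch below is an illustration only: the tree is given by a hypothetical map sending each non-terminal point to the unique point one level below it, and the sample tree is invented for the purpose and is not one of the figures.

```python
from collections import defaultdict


def levels_and_initial_points(parent, terminal):
    """Return the level of every point and the sorted list of initial points.

    `parent` maps each non-terminal point to the unique point one level
    below it; `terminal` is the single point at level 0.
    """
    predecessors = defaultdict(list)
    for point, below in parent.items():
        predecessors[below].append(point)

    level = {terminal: 0}
    frontier = [terminal]
    while frontier:
        next_frontier = []
        for point in frontier:
            for pred in predecessors[point]:
                level[pred] = level[point] + 1
                next_frontier.append(pred)
        frontier = next_frontier

    # An initial point is a point with no predecessors.
    initial = sorted(p for p in level if not predecessors[p])
    return level, initial


# Hypothetical finite G-tree: E is terminal, P and D lie at level 1,
# and A and B are the predecessors of P.
parent = {"P": "E", "D": "E", "A": "P", "B": "P"}
level, initial = levels_and_initial_points(parent, "E")
```

Trees with denumerably many predecessors at some point, as in the second example, cannot be stored this way; the sketch covers only the finite case.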
By a proof-tree, we mean an assignment of wfs of S∞ to the points of a G-tree
such that

1. The wfs assigned to the initial points are axioms of S∞;

2. The wfs assigned to a non-initial point P and to the predecessors
of P are, respectively, the conclusion and premisses of some rule of
inference;

3. There is a maximal degree of the cuts appearing in the proof-tree.
This maximal degree is called the degree of the proof-tree. If there are
no cuts, the degree is 0;

4. There is an assignment of an ordinal number to each wf occurring in the
proof-tree such that (a) the ordinal of the conclusion of a weak rule is the
same as the ordinal of the premiss and (b) the ordinal of the conclusion
of a strong rule or a cut is greater than the ordinals of the premisses.

The wf assigned to the terminal point of a proof-tree is called the terminal
wf; the ordinal of the terminal wf is called the ordinal of the proof-tree.
The proof-tree is said to be a proof of the terminal wf, and the theorems of
S∞ are defined to be the wfs that are terminal wfs of proof-trees. Notice
that, since all axioms of S∞ are closed wfs and the rules of inference
take closed premisses into closed consequences, all theorems of S∞ are
closed wfs.

A thread in a proof-tree is a finite or denumerable sequence 𝒜₁, 𝒜₂, ... of
wfs starting with the terminal wf and such that each wf 𝒜ᵢ₊₁ is a predecessor
of 𝒜ᵢ. Hence, the ordinals α₁, α₂, ... assigned to the wfs in a thread do
not increase, and they decrease at each application of a strong rule or a cut.
Since there cannot exist a denumerably decreasing sequence of ordinals, it
follows that only a finite number of applications of strong rules or cuts can
be involved in a thread. Also, to a given wf, only a finite number of applications
of weak rules are necessary. Hence, we can assume that there are only
a finite number of consecutive applications of weak rules in any thread of
a proof-tree. (Let us make this part of the definition of "proof-tree.") Then
every thread of a proof-tree is finite.

If we restrict the class of ordinals that may be assigned to the wfs of a 
proof-tree, then this restricts the notion of a proof-tree, and, therefore, we 
obtain a (possibly) smaller set of theorems. If one uses various “construc- 
tive” segments of denumerable ordinals, then the systems so obtained and 
the methods used in the consistency proof later may be considered more or 
less “constructive”. 


Exercise

Prove that the associative rules

          (𝒞 ∨ 𝒜) ∨ ℬ                 𝒞 ∨ (𝒜 ∨ ℬ)
          -------------      and      -------------
          𝒞 ∨ (𝒜 ∨ ℬ)                 (𝒞 ∨ 𝒜) ∨ ℬ

are derivable from the exchange rule, assuming association to the left. Hence,
parentheses may be omitted from a disjunction.


Lemma A.1


Let 𝒜 be a closed wf having n connectives and quantifiers. Then there is a proof of
~𝒜 ∨ 𝒜 of ordinal ≤ 2n + 1 (in which no cut is used).


Proof 


Induction on n.

1. n = 0. Then 𝒜 is a closed atomic wf. Hence, either 𝒜 or ~𝒜 is
an axiom, because 𝒜 is either correct or incorrect. Hence, by
one application of the dilution rule, one of the following is a
proof-tree.

          𝒜                            ~𝒜
          --------- dilution    or     --------- dilution
          ~𝒜 ∨ 𝒜                       𝒜 ∨ ~𝒜
                                       --------- exchange
                                       ~𝒜 ∨ 𝒜

Hence, we can assign ordinals so that the proof of ~𝒜 ∨ 𝒜 has ordinal 1.
2. Assume true for all k < n.

Case (i): 𝒜 is 𝒜₁ ∨ 𝒜₂. By inductive hypothesis, there are proofs of ~𝒜₁ ∨ 𝒜₁
and ~𝒜₂ ∨ 𝒜₂ of ordinals ≤ 2(n − 1) + 1 = 2n − 1. By dilution, we obtain
proofs of ~𝒜₁ ∨ 𝒜₁ ∨ 𝒜₂ and ~𝒜₂ ∨ 𝒜₁ ∨ 𝒜₂, respectively, of ordinal 2n and, by
De Morgan's rule, a proof of ~(𝒜₁ ∨ 𝒜₂) ∨ 𝒜₁ ∨ 𝒜₂ of ordinal 2n + 1.

Case (ii): 𝒜 is ~ℬ. Then, by inductive hypothesis, there is a proof of ~ℬ ∨ ℬ of
ordinal 2n − 1. By the exchange rule, we obtain a proof of ℬ ∨ ~ℬ of ordinal
2n − 1, and then, applying the negation rule, we have a proof of ~~ℬ ∨ ~ℬ,
that is, of ~𝒜 ∨ 𝒜, of ordinal 2n ≤ 2n + 1.

Case (iii): 𝒜 is (x)ℬ(x). By inductive hypothesis, for every natural number k,
there is a proof of ~ℬ(k̄) ∨ ℬ(k̄) of ordinal ≤ 2n − 1. Then, by the quantification rule,
for each k there is a proof of (~(x)ℬ(x)) ∨ ℬ(k̄) of ordinal ≤ 2n and, hence, by
the exchange rule, a proof of ℬ(k̄) ∨ ~(x)ℬ(x) of ordinal ≤ 2n. Finally, by an
application of the infinite induction rule, we obtain a proof of ((x)ℬ(x)) ∨ ~(x)ℬ(x)
of ordinal ≤ 2n + 1 and, by the exchange rule, a proof of (~(x)ℬ(x)) ∨
(x)ℬ(x) of ordinal ≤ 2n + 1.
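The ordinal bookkeeping in this induction can be checked mechanically for finite ordinals. The sketch below makes assumptions of its own: a wf is a nested tuple tagged "atom", "or", "not", or "all" (a hypothetical encoding), each strong rule raises the ordinal by exactly one, and weak rules leave it unchanged.

```python
def size(wf):
    """Number of connectives and quantifiers occurring in wf."""
    if wf[0] == "atom":
        return 0
    return 1 + sum(size(sub) for sub in wf[1:])


def ordinal(wf):
    """Ordinal of the cut-free proof of ~wf v wf built as in the induction."""
    tag = wf[0]
    if tag == "atom":
        return 1                                        # one dilution
    if tag == "or":                                     # dilution, then De Morgan
        return max(ordinal(wf[1]), ordinal(wf[2])) + 2
    if tag == "not":                                    # exchange, then negation
        return ordinal(wf[1]) + 1
    if tag == "all":                                    # quantification, then infinite induction
        return ordinal(wf[1]) + 2
    raise ValueError(f"unknown tag: {tag!r}")


atom = ("atom",)
sample = ("not", ("or", atom, ("all", atom)))   # 3 connectives and quantifiers
bound_ok = ordinal(sample) <= 2 * size(sample) + 1
```

For the sample wf the computed ordinal is 6, within the bound 2 · 3 + 1 = 7.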


Lemma A.2


For any closed terms t and s, and any wf 𝒜(x) with x as its only free variable, the
wf s ≠ t ∨ ~𝒜(s) ∨ 𝒜(t) is a theorem of S∞ and is provable without applying the
cut rule.


Proof


In general, if a closed wf ℬ(t) is provable in S∞, and s has the same value as t,
then ℬ(s) is also provable in S∞. (Simply replace all occurrences of t that are
"deductively connected" with the t in the terminal wf ℬ(t) by s.) Now, if s has
the same value n as t, then, since ~𝒜(n̄) ∨ 𝒜(n̄) is provable, it follows by the
previous remark that ~𝒜(s) ∨ 𝒜(t) is provable. Hence, by dilution, s ≠ t ∨ ~𝒜(s) ∨
𝒜(t) is provable. If s and t have different values, s = t is incorrect; hence, s ≠ t is
an axiom. So, by dilution and exchange, s ≠ t ∨ ~𝒜(s) ∨ 𝒜(t) is a theorem.

Lemma A.3


Every closed wf that is a theorem of S is also a theorem of S∞.


Proof


Let 𝒜 be a closed wf that is a theorem of S. Clearly, every proof in S can be
represented in the form of a finite proof-tree, where the initial wfs are axioms
of S and the rules of inference are modus ponens and generalization. Let n be
an ordinal assigned to such a proof-tree for 𝒜.

If n = 0, then 𝒜 is an axiom of S.

1. 𝒜 is ℬ ⊃ (𝒞 ⊃ ℬ), that is, ~ℬ ∨ (~𝒞 ∨ ℬ). But ~ℬ ∨ ℬ is provable in
S∞ (Lemma A.1). Hence, so is ~ℬ ∨ ~𝒞 ∨ ℬ by a dilution and an
exchange.

2. 𝒜 is (ℬ ⊃ (𝒞 ⊃ 𝒟)) ⊃ ((ℬ ⊃ 𝒞) ⊃ (ℬ ⊃ 𝒟)), that is, ~(~ℬ ∨ ~𝒞 ∨ 𝒟) ∨
~(~ℬ ∨ 𝒞) ∨ (~ℬ ∨ 𝒟). By Lemma A.1, we have ~(~ℬ ∨ 𝒞) ∨ ~ℬ ∨ 𝒞 and
(~ℬ ∨ ~𝒞 ∨ 𝒟) ∨ ~(~ℬ ∨ ~𝒞 ∨ 𝒟). Then, by exchange, a cut (with 𝒞 as
cut formula), and consolidation, ~(~ℬ ∨ ~𝒞 ∨ 𝒟) ∨ ~(~ℬ ∨ 𝒞) ∨ ~ℬ ∨ 𝒟
is provable.

3. 𝒜 is (~ℬ ⊃ ~𝒞) ⊃ ((~ℬ ⊃ 𝒞) ⊃ ℬ), that is, ~(~~ℬ ∨ ~𝒞) ∨ ~(~~ℬ ∨ 𝒞)
∨ ℬ. Now, by Lemma A.1 we have ~ℬ ∨ ℬ, and then, by the negation
rule, ~~~ℬ ∨ ℬ, and, by dilution and exchange,

   a. ~~~ℬ ∨ ~(~~ℬ ∨ 𝒞) ∨ ℬ.

Similarly, we obtain ~~~ℬ ∨ ℬ ∨ ~~𝒞 and ~𝒞 ∨ ℬ ∨ ~~𝒞, and by
De Morgan's rule, ~(~~ℬ ∨ 𝒞) ∨ ℬ ∨ ~~𝒞; then, by exchange,

   b. ~~𝒞 ∨ ~(~~ℬ ∨ 𝒞) ∨ ℬ.

From (a) and (b), by De Morgan's rule, we have ~(~~ℬ ∨ ~𝒞) ∨
~(~~ℬ ∨ 𝒞) ∨ ℬ.

4. 𝒜 is (x)ℬ(x) ⊃ ℬ(t), that is, (~(x)ℬ(x)) ∨ ℬ(t). Then, by Lemma A.1, we
have ~ℬ(t) ∨ ℬ(t); by the quantification rule, (~(x)ℬ(x)) ∨ ℬ(t).

5. 𝒜 is (x)(ℬ ⊃ 𝒞) ⊃ (ℬ ⊃ (x)𝒞), where x is not free in ℬ, that is, ~(x)(~ℬ ∨
𝒞(x)) ∨ ~ℬ ∨ (x)𝒞(x). Now, by Lemma A.1, for every natural number n,
there is a proof of ~(~ℬ ∨ 𝒞(n̄)) ∨ ~ℬ ∨ 𝒞(n̄). (Note that the ordinals of
these proofs are bounded by 2k + 1, where k is the number of
propositional connectives and quantifiers in ~ℬ ∨ 𝒞(x).)
Hence, by the quantification rule, for each n, there is a proof of

          ~(x)(~ℬ ∨ 𝒞(x)) ∨ ~ℬ ∨ 𝒞(n̄)      (of ordinal ≤ 2k + 2)

Hence, by exchange and infinite induction, there is a proof of

          ~(x)(~ℬ ∨ 𝒞(x)) ∨ ~ℬ ∨ (x)𝒞(x)      (of ordinal ≤ 2k + 3)

(S1) 𝒜 is t₁ = t₂ ⊃ (t₁ = t₃ ⊃ t₂ = t₃), that is, t₁ ≠ t₂ ∨ t₁ ≠ t₃ ∨ t₂ = t₃.
Apply Lemma A.2, with x = t₃ as 𝒜(x), t₁ as s, t₂ as t.

(S2) 𝒜 is t₁ = t₂ ⊃ (t₁)′ = (t₂)′, that is, t₁ ≠ t₂ ∨ (t₁)′ = (t₂)′. If t₁ and t₂ have the same
value, then so do (t₁)′ and (t₂)′. Hence (t₁)′ = (t₂)′ is correct and therefore an
axiom. By dilution, we obtain t₁ ≠ t₂ ∨ (t₁)′ = (t₂)′. If t₁ and t₂ have different
values, t₁ ≠ t₂ is an axiom; hence, by dilution and exchange, t₁ ≠ t₂ ∨ (t₁)′ = (t₂)′
is provable.

(S3) 𝒜 is 0 ≠ t′. 0 and t′ have different values; hence, 0 ≠ t′ is an axiom.
(S4) 𝒜 is (t₁)′ = (t₂)′ ⊃ t₁ = t₂, that is, (t₁)′ ≠ (t₂)′ ∨ t₁ = t₂. (Exercise.)
(S5) 𝒜 is t + 0 = t. t + 0 and t have the same values. Hence, t + 0 = t is an axiom.

(S6)-(S8) follow similarly from the recursion equations for evaluating closed
terms.


(S9) 𝒜 is ℬ(0) ⊃ ((x)(ℬ(x) ⊃ ℬ(x′)) ⊃ (x)ℬ(x)), that is,

          ~ℬ(0) ∨ ~(x)(~ℬ(x) ∨ ℬ(x′)) ∨ (x)ℬ(x)

1. Clearly, by Lemma A.1, exchange and dilution,
~ℬ(0) ∨ ~(x)(~ℬ(x) ∨ ℬ(x′)) ∨ ℬ(0) is provable.
2. For k ≥ 0, let us prove by induction that the following wf is provable:

          ~ℬ(0) ∨ ~(~ℬ(0) ∨ ℬ(1̄)) ∨ ... ∨ ~(~ℬ(k̄) ∨ ℬ(k̄′)) ∨ ℬ(k̄′).

a. For k = 0, ⊢S∞ ~~ℬ(0) ∨ ~ℬ(0) ∨ ℬ(1̄) by Lemma A.1, dilution,
and exchange; similarly, ⊢S∞ ~ℬ(1̄) ∨ ~ℬ(0) ∨ ℬ(1̄). Hence, by
De Morgan's rule, ⊢S∞ ~(~ℬ(0) ∨ ℬ(1̄)) ∨ ~ℬ(0) ∨ ℬ(1̄), and by
exchange,

          ⊢S∞ ~ℬ(0) ∨ ~(~ℬ(0) ∨ ℬ(1̄)) ∨ ℬ(1̄)

b. Assume for k:

          ⊢S∞ ~ℬ(0) ∨ ~(~ℬ(0) ∨ ℬ(1̄)) ∨ ...
                    ∨ ~(~ℬ(k̄) ∨ ℬ(k̄′)) ∨ ℬ(k̄′)

Hence, by exchange, negation, and dilution,

          ⊢S∞ ~~ℬ(k̄′) ∨ ~ℬ(0) ∨ ~(~ℬ(0) ∨ ℬ(1̄)) ∨ ...
                    ∨ ~(~ℬ(k̄) ∨ ℬ(k̄′)) ∨ ℬ(k̄″)

Also, by Lemma A.1 for ℬ(k̄″), dilution and exchange,

          ⊢S∞ ~ℬ(k̄″) ∨ ~ℬ(0) ∨ ~(~ℬ(0) ∨ ℬ(1̄)) ∨ ...
                    ∨ ~(~ℬ(k̄) ∨ ℬ(k̄′)) ∨ ℬ(k̄″)

Hence, by De Morgan's rule,

          ⊢S∞ ~(~ℬ(k̄′) ∨ ℬ(k̄″)) ∨ ~ℬ(0) ∨ ~(~ℬ(0) ∨ ℬ(1̄)) ∨ ...
                    ∨ ~(~ℬ(k̄) ∨ ℬ(k̄′)) ∨ ℬ(k̄″)

and, by exchange, the result follows for k + 1.

Now, applying the exchange and quantification rules k times to the result
of (2), we have, for each k ≥ 0,

          ⊢S∞ ~ℬ(0) ∨ ~(x)(~ℬ(x) ∨ ℬ(x′)) ∨ ...
                    ∨ ~(x)(~ℬ(x) ∨ ℬ(x′)) ∨ ℬ(k̄′)

and, by consolidation, ⊢S∞ ~ℬ(0) ∨ ~(x)(~ℬ(x) ∨ ℬ(x′)) ∨ ℬ(k̄′). Hence, together
with (1), we have, for all k ≥ 0,

          ⊢S∞ ~ℬ(0) ∨ ~(x)(~ℬ(x) ∨ ℬ(x′)) ∨ ℬ(k̄)

Then, by infinite induction,

          ⊢S∞ ~ℬ(0) ∨ ~(x)(~ℬ(x) ∨ ℬ(x′)) ∨ (x)ℬ(x)
Thus, all the closed axioms of S are provable in S∞. We assume now that
n > 0. Then, (i) 𝒜 may arise by modus ponens from ℬ and ℬ ⊃ 𝒜, where
ℬ and ℬ ⊃ 𝒜 have smaller ordinals in the proof-tree. We may assume that
ℬ contains no free variables, since we can replace any such free variables by
0 in ℬ and its predecessors in the proof-tree.

Hence, by inductive hypothesis, ⊢S∞ ℬ and ⊢S∞ ℬ ⊃ 𝒜, that is, ⊢S∞ ~ℬ ∨ 𝒜.
Hence, by a cut, we obtain ⊢S∞ 𝒜. The other possibility (ii) is that 𝒜 is (x)ℬ(x) and
comes by generalization from ℬ(x). Now, in the proof-tree, working backward
from ℬ(x), replace the appropriate free occurrences of x by n̄. We then obtain a
proof of ℬ(n̄), of the same ordinal. This holds for all n; by inductive hypothesis,
⊢S∞ ℬ(n̄) for all n. Hence, by infinite induction, ⊢S∞ (x)ℬ(x), that is, ⊢S∞ 𝒜.


Corollary A.4


If S∞ is consistent, S is consistent.

Proof


If S is inconsistent, then ⊢S 0 ≠ 0. Hence, by Lemma A.3, ⊢S∞ 0 ≠ 0. But ⊢S∞ 0 = 0,
since 0 = 0 is correct. For any wf 𝒜 of S∞, we would have, by dilution,
⊢S∞ 0 ≠ 0 ∨ 𝒜, and, together with ⊢S∞ 0 = 0, by a cut, ⊢S∞ 𝒜. Thus, any wf of S∞
is provable; so S∞ is inconsistent.

By Corollary A.4, to prove the consistency of S, it suffices to show the
consistency of S∞.


Lemma A.5 


The rules of De Morgan, negation, and infinite induction are invertible, that is, from 
a proof of a wf that is a consequence of some premisses by one of these rules one can 
obtain a proof of the premisses (and the ordinal and degree of such a proof are no 
higher than the ordinal and degree of the original proof). 


Proof


1. De Morgan. 𝒜 is ~(ℬ ∨ ℰ) ∨ 𝒟. Take a proof of 𝒜. Take all those
subformulas ~(ℬ ∨ ℰ) of wfs of the proof-tree obtained by starting
with ~(ℬ ∨ ℰ) in 𝒜 and working back up the proof-tree. This process
continues through all applications of weak rules and through
all strong rules in which ~(ℬ ∨ ℰ) is part of a side wf. It can end
only at dilutions

          ℱ
          ---------------
          ~(ℬ ∨ ℰ) ∨ ℱ

or applications of De Morgan's rule:

          ~ℬ ∨ ℱ     ~ℰ ∨ ℱ
          -------------------
          ~(ℬ ∨ ℰ) ∨ ℱ

The set of all occurrences of ~(ℬ ∨ ℰ) obtained by
this process is called the history of ~(ℬ ∨ ℰ). Let us replace all
occurrences of ~(ℬ ∨ ℰ) in its history by ~ℬ. Then we still have a proof-tree
(after unnecessary formulas are erased), and the terminal wf is ~ℬ ∨ 𝒟.
Similarly, if we replace ~(ℬ ∨ ℰ) by ~ℰ, we obtain a proof of ~ℰ ∨ 𝒟.

2. Negation. 𝒜 is ~~ℬ ∨ 𝒟. Define the history of ~~ℬ as was done for
~(ℬ ∨ ℰ) in (1); replace all occurrences of ~~ℬ in its history by ℬ; the
result is a proof of ℬ ∨ 𝒟.

3. Infinite induction. 𝒜 is ((x)ℬ(x)) ∨ 𝒟. Define the history of (x)ℬ(x) as
in (1); replace (x)ℬ(x) in its history by ℬ(n̄) (and if one of the initial
occurrences in its history appears as the consequence of an infinite
induction, erase the tree above all the premisses except the one
involving n̄); we then obtain a proof of ℬ(n̄) ∨ 𝒟.


Lemma A.6


(Schütte 1951: Reduktionssatz). Given a proof of 𝒜 in S∞ of positive degree m and
ordinal α, there is a proof of 𝒜 in S∞ of lower degree and ordinal 2^α.

Proof


By transfinite induction on the ordinal α of the given proof of 𝒜. α = 0: this
proof can contain no cuts and, hence, has degree 0. Assume the theorem
proved for all ordinals < α. Starting from the terminal wf 𝒜, find the first
application of a nonweak rule, that is, of a strong rule or a cut. If it is a strong
rule, each premiss has ordinal α₁ < α. By inductive hypothesis, for these
premisses, there are proof-trees of lower degree and ordinal 2^α₁. Substitute these
proof-trees for the proof-trees above the premisses in the original proof. We
thus obtain a new proof for 𝒜 except that the ordinal of 𝒜 should be taken to
be 2^α, which is greater than every 2^α₁.

The remaining case is that of a cut:

          𝒞 ∨ ℬ     ~ℬ ∨ 𝒟
          ------------------
          𝒞 ∨ 𝒟

If the ordinals of 𝒞 ∨ ℬ and ~ℬ ∨ 𝒟 are α₁, α₂, then, by inductive hypothesis,
we can replace the proof-trees above them so that the degrees are reduced
and the ordinals are 2^α₁, 2^α₂, respectively. We shall distinguish various cases
according to the form of the cut formula ℬ:

a. ℬ is an atomic wf. Either ℬ or ~ℬ must be an axiom. Let 𝒦 be the
non-axiom of ℬ and ~ℬ. By inductive hypothesis, the proof-tree
above the premiss containing 𝒦 can be replaced by a proof-tree with
a lower degree having ordinal 2^αᵢ (i = 1 or 2). In this new proof-tree,
consider the history of 𝒦 (as defined in the proof of Lemma A.5). The
initial wfs in this history can arise only by dilutions. So, if we erase
all occurrences of 𝒦 in this history, we obtain a proof-tree for 𝒞 or for
𝒟 of ordinal 2^αᵢ; then, by dilution, we obtain 𝒞 ∨ 𝒟, of ordinal 2^α. The
degree of the new proof-tree is less than m.


b. ℬ is ~ℰ:

          𝒞 ∨ ~ℰ     ~~ℰ ∨ 𝒟
          --------------------
          𝒞 ∨ 𝒟

There is a proof-tree for ~~ℰ ∨ 𝒟 of degree < m and ordinal 2^α₂. By
Lemma A.5, there is a proof-tree for ℰ ∨ 𝒟 of degree < m and ordinal
2^α₂. There is also, by inductive hypothesis, a proof-tree for 𝒞 ∨ ~ℰ of
degree < m and ordinal 2^α₁. Now, construct

          ℰ ∨ 𝒟                   𝒞 ∨ ~ℰ
          ------- Exchange        -------- Exchange
          𝒟 ∨ ℰ                   ~ℰ ∨ 𝒞
          --------------------------------- Cut
                      𝒟 ∨ 𝒞
                      ------- Exchange
                      𝒞 ∨ 𝒟

The degree of the indicated cut is the degree of ~ℰ, which is one less
than the degree of ~~ℰ, which, in turn, is ≤ m. The ordinal of 𝒟 ∨ 𝒞
can be taken to be 2^α. Hence, we have a proof of lower degree and
ordinal 2^α.
c. ℬ is ℰ ∨ ℱ:

          𝒞 ∨ ℰ ∨ ℱ     ~(ℰ ∨ ℱ) ∨ 𝒟
          ----------------------------
          𝒞 ∨ 𝒟

There is a proof-tree for ~(ℰ ∨ ℱ) ∨ 𝒟 of lower degree and ordinal 2^α₂.
Hence, by Lemma A.5, there are proof-trees for ~ℰ ∨ 𝒟 and ~ℱ ∨ 𝒟 of
degree < m and ordinal 2^α₂. There is also a proof-tree for 𝒞 ∨ ℰ ∨ ℱ of
degree < m and ordinal 2^α₁. Construct

          𝒞 ∨ ℰ ∨ ℱ     ~ℱ ∨ 𝒟
          ---------------------- Cut
               𝒞 ∨ ℰ ∨ 𝒟
               ----------- Exchange
               𝒞 ∨ 𝒟 ∨ ℰ          ~ℰ ∨ 𝒟
               --------------------------- Cut
                      𝒞 ∨ 𝒟 ∨ 𝒟
                      ----------- Consolidation
                      𝒞 ∨ 𝒟

The cuts indicated have degrees < m; hence, the new proof-tree has
degree < m; the ordinal of 𝒞 ∨ ℰ ∨ 𝒟 can be taken as 2^max(α₁,α₂) +₀ 1 and
then the ordinal of 𝒞 ∨ 𝒟 ∨ 𝒟 and 𝒞 ∨ 𝒟 as 2^α.


d. ℬ is (x)ℰ:

          𝒞 ∨ (x)ℰ     (~(x)ℰ) ∨ 𝒟
          -------------------------
          𝒞 ∨ 𝒟

By inductive hypothesis, the proof-tree above 𝒞 ∨ (x)ℰ can
be replaced by one with smaller degree and ordinal 2^α₁. By Lemma
A.5 and the remark at the beginning of the proof of Lemma A.2, we
can obtain proofs of 𝒞 ∨ ℰ(t) of degree < m and ordinal 2^α₁, for any
closed term t. Now, the proof-tree above the right-hand formula
(~(x)ℰ) ∨ 𝒟 can be replaced, by inductive hypothesis, by one with
smaller degree and ordinal 2^α₂. The history of ~(x)ℰ in this proof
terminates above either at dilutions or as principal wfs in applications
of the quantification rule:

          ~ℰ(tᵢ) ∨ 𝒢ᵢ
          ---------------
          (~(x)ℰ) ∨ 𝒢ᵢ

Replace every such application by the cut

          𝒞 ∨ ℰ(tᵢ)     (~ℰ(tᵢ)) ∨ 𝒢ᵢ
          ----------------------------
          𝒞 ∨ 𝒢ᵢ

Replace all occurrences in the history of ~(x)ℰ(x) by 𝒞. The result is still a
proof-tree, and the terminal wf is 𝒞 ∨ 𝒟. The proof-tree has degree < m,
since the degree of ~ℰ(tᵢ) is less than the degree of ~(x)ℰ. Replace each old
ordinal β of the proof-tree by 2^α₁ +₀ β. If β was the ordinal of the premiss
~ℰ(tᵢ) ∨ 𝒢ᵢ of an eliminated quantification rule application earlier, and if
γ was the ordinal of the conclusion (~(x)ℰ) ∨ 𝒢ᵢ, then, in the new cut
introduced, 𝒞 ∨ ℰ(tᵢ) has ordinal 2^α₁, ~ℰ(tᵢ) ∨ 𝒢ᵢ has ordinal 2^α₁ +₀ β, and the
conclusion 𝒞 ∨ 𝒢ᵢ has ordinal 2^α₁ +₀ γ >₀ max(2^α₁, 2^α₁ +₀ β). At all other places, the
ordinal of the conclusion is still greater than the ordinal of the premisses,
since δ <₀ μ implies 2^α₁ +₀ δ <₀ 2^α₁ +₀ μ. Finally, the right-hand premiss
(~(x)ℰ) ∨ 𝒟 (originally of ordinal α₂) goes over into 𝒞 ∨ 𝒟 with ordinal
2^α₁ +₀ 2^α₂ ≤₀ 2^max(α₁,α₂) +₀ 2^max(α₁,α₂) = 2^max(α₁,α₂) ×₀ 2 = 2^(max(α₁,α₂) +₀ 1) ≤₀ 2^α. If this
is <₀ 2^α, the ordinal of 𝒞 ∨ 𝒟 can be raised to 2^α.
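For finite ordinals the final estimate is ordinary arithmetic, and it can be spot-checked numerically. The sketch below is only such a check, restricted to natural numbers; it does not touch transfinite ordinals, and the function name is hypothetical.

```python
def estimate_holds(alpha):
    """Check 2^a1 + 2^a2 <= 2^(max(a1, a2) + 1) <= 2^alpha for all a1, a2 < alpha."""
    return all(
        2 ** a1 + 2 ** a2 <= 2 ** (max(a1, a2) + 1) <= 2 ** alpha
        for a1 in range(alpha)
        for a2 in range(alpha)
    )


# Spot-check every finite alpha up to 11.
all_hold = all(estimate_holds(alpha) for alpha in range(1, 12))
```

The middle term is where the step from max(α₁, α₂) to its successor is used: since α₁ and α₂ are both smaller than α, that successor cannot exceed α.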


Corollary A.7


Every proof of 𝒜 of ordinal α and degree m can be replaced by a proof of 𝒜 of ordinal
2^(2^(...^(2^α)...)) (an iterated exponential obtained by repeated application of
Lemma A.6) and degree 0 (i.e., a cut-free proof).
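To see how quickly the ordinal grows, one may iterate the bound of Lemma A.6 on finite ordinals. The sketch assumes, for illustration only, that the lemma is applied exactly m times (each application lowers the degree by at least one, so fewer applications may suffice); the function name is hypothetical.

```python
def cut_free_ordinal(alpha, m):
    """Apply the map alpha -> 2^alpha exactly m times to a finite ordinal."""
    for _ in range(m):
        alpha = 2 ** alpha
    return alpha


# A proof of ordinal 1 and degree 3 yields the bound 2^(2^(2^1)) = 16.
example = cut_free_ordinal(1, 3)
```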


Proposition A.8

S∞ is consistent.


Proof


Consider any wf 𝒜 of the form (0 ≠ 0) ∨ (0 ≠ 0) ∨ ... ∨ (0 ≠ 0). If there is a
proof of 𝒜, then by Corollary A.7, there is a cut-free proof of 𝒜. By inspection
of the rules of inference, 𝒜 can be derived only from other wfs of the same
form: (0 ≠ 0) ∨ ... ∨ (0 ≠ 0). Hence, the axioms of the proof would have to be
of this form. But there are no axioms of this form; hence, 𝒜 is unprovable.
Therefore, S∞ is consistent.


Answers to Selected Exercises 


Chapter 1

1.1   A   B
      T   T   F
      F   T   T
      T   F   T
      F   F   F

1.2   A   B   ¬A   A ⇒ B   (A ⇒ B) ∨ ¬A
      T   T   F    T       T
      F   T   T    T       T
      T   F   F    F       F
      F   F   T    T       T

1.3   ((A   ⇒   B)   ∧   A)
        T   T   T    T   T
        F   T   T    F   F
        T   F   F    F   T
        F   T   F    F   F
14 a. ((A=(B))A((4A) > (-8B))) 
c. (A=>B), A: x is prime, B: x is odd. 
d. (A=>B), A: the sequence s converges, 


1.5 


B: the sequence s is bounded. 
(AS(BA(CAD))) A: the sheikh is happy, 

B: the sheikh has wine, 

C: the sheikh has women, 

D: the sheikh has song. 
(A => B), A: Fiorello goes to the movies. 
(A) => B), A: Kasparov wins today, 

B: Karpov will win the tournament. 


(c), (A), (f), (g), @), Gj) are tautologies. 




1.33 
1.34 
1.36 


1.37
(a), (b), (d), (e), (f) are logically equivalent pairs.


All except (i). 

Only (c) and (e). 

a. (BSA7A)AC () ASBe-(CvD) 

c. Drop all parentheses. (g) --—7(B v C) = (B } C)) 

a. (C V(CA) A B)) (©) (C >(C( V B) > C)) A A) & B) 

a. ((ACA)) & A) © (B Vv C)) @) and (f) are the only ones that are not 


abbreviations of statement forms. 

V = C-AB and vC > AB=DC 

(a) \=> BAC (b) VAV BC 

(i) is not. (ii) (A > B) >(B > C) > CAS COC) 

is contradictory, and (a), (d), (e), (g)—-() are tautologies. 
(b)-(d) are false. 

a. T (b) T (c) indeterminate

a. A is T, B is F, and ¬A ∨ (A ⇒ B) is F.

c 

c 


man p 


A is T, Cis T, B is T. 
(i) A A(B A C) V GB A =C)) (ii) AA BA AC 
(iii) -A V (7B AC) 

a. If 𝒜 is a tautology, the result of replacing all statement letters by
their negations is a tautology. If we then move all negation signs
outward by using Exercise 1.27 (k) and (l), the resulting tautology
is ¬𝒜′. Conversely, if ¬𝒜′ is a tautology, let ℬ be ¬𝒜′. By the first
part, ¬ℬ′ is a tautology. But ¬ℬ′ is ¬¬𝒜.


ec (AAAABA-7AC)V(AABA-D) 
a. For figure 1.4: 


1B 
(a), (d) and (h) are not correct. 


a. Satisfiable: Let A, B, and C be F, and let D be T. 
For f, 


(AABAC)v (AA ABAC)vV(AA=BAC)v (AHA A-=B a =C) 
For ⇒ and ∨, notice that any statement form built up using ⇒ and ∨
will always take the value T when the statement letters in it are T. In
the case of ¬ and ⇔, using only the statement letters A and B, find all
the truth functions of two variables that can be generated by applying
¬ and ⇔ any number of times.

1.40
1.41
1.42
1.43
1.45
1.47
1.48


a. 


2⁴ = 16 (b) 2^(2^n)


h(C, C, C) = ¬C and h(B, B, ¬C) is B ⇒ C.


b. 


For ¬(A ⇒ B) ∨ (¬A ∧ C), a disjunctive normal form is (A ∧ ¬B) ∨ (¬A
∧ C), and a conjunctive normal form is (A ∨ C) ∧ (¬B ∨ ¬A) ∧ (¬B ∨ C).
(i) For (A ∧ B) ∨ ¬A, a full dnf is (A ∧ B) ∨ (¬A ∧ B) ∨ (¬A ∧ ¬B), and
a full cnf is B ∨ ¬A.
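Normal forms like these are easy to confirm by brute force over all truth assignments. The sketch below is one way to do so; the helper names implies and equivalent are our own and not part of the text.

```python
from itertools import product


def implies(p, q):
    """Truth function of the conditional."""
    return (not p) or q


def equivalent(f, g, n):
    """True when f and g agree on every assignment to n statement letters."""
    return all(f(*v) == g(*v) for v in product([True, False], repeat=n))


# First statement form, with its dnf and cnf.
form1 = lambda a, b, c: (not implies(a, b)) or ((not a) and c)
dnf1 = lambda a, b, c: (a and not b) or ((not a) and c)
cnf1 = lambda a, b, c: (a or c) and ((not b) or (not a)) and ((not b) or c)

# Second statement form, with its full dnf and full cnf.
form2 = lambda a, b: (a and b) or (not a)
full_dnf2 = lambda a, b: (a and b) or ((not a) and b) or ((not a) and (not b))
full_cnf2 = lambda a, b: b or (not a)

checks = [
    equivalent(form1, dnf1, 3),
    equivalent(form1, cnf1, 3),
    equivalent(form2, full_dnf2, 2),
    equivalent(form2, full_cnf2, 2),
]
```

All four comparisons succeed, so each normal form has the same truth table as the form it was derived from.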

(i) Yes. A: T, B: T, C: F (ii) Yes. A: T, B: F, C: T

A conjunction “of the form B; A ... A B;, where each B; is either B; 
or —B,, is said to be eligible if some assignment of truth values to 
the statement letters of .7 that makes .7 true also makes ~« true. 
Let ~ be the disjunction of all eligible conjunctions. 


1. 𝒞 ⇒ 𝒟                                      Hypothesis
2. ℬ ⇒ 𝒞                                      Hypothesis
3. (ℬ ⇒ (𝒞 ⇒ 𝒟)) ⇒ ((ℬ ⇒ 𝒞) ⇒ (ℬ ⇒ 𝒟))      Axiom (A2)
4. (𝒞 ⇒ 𝒟) ⇒ (ℬ ⇒ (𝒞 ⇒ 𝒟))                  Axiom (A1)
5. ℬ ⇒ (𝒞 ⇒ 𝒟)                                1, 4, MP
6. (ℬ ⇒ 𝒞) ⇒ (ℬ ⇒ 𝒟)                         3, 5, MP
7. ℬ ⇒ 𝒟                                      2, 6, MP

1. ℬ ⇒ ¬¬ℬ                                    Lemma 1.11(b)
2. ¬¬ℬ ⇒ (¬ℬ ⇒ 𝒞)                             Lemma 1.11(c)
3. ℬ ⇒ (¬ℬ ⇒ 𝒞)                               1, 2, Corollary 1.10(a)
4. ℬ ⇒ (ℬ ∨ 𝒞)                                3, Abbreviation

1. ¬ℬ ⇒ 𝒞                                     Hypothesis
2. (¬ℬ ⇒ 𝒞) ⇒ (¬𝒞 ⇒ ¬¬ℬ)                      Lemma 1.11(e)
3. ¬𝒞 ⇒ ¬¬ℬ                                   1, 2, MP
4. ¬¬ℬ ⇒ ℬ                                    Lemma 1.11(a)
5. ¬𝒞 ⇒ ℬ                                     3, 4, Corollary 1.10(a)
6. ¬ℬ ⇒ 𝒞 ⊢ ¬𝒞 ⇒ ℬ                            1-5
7. ⊢ (¬ℬ ⇒ 𝒞) ⇒ (¬𝒞 ⇒ ℬ)                      6, deduction theorem
8. ⊢ (ℬ ∨ 𝒞) ⇒ (𝒞 ∨ ℬ)                        7, abbreviation

Take any assignment of truth values to the statement letters of 𝒜 that
makes 𝒜 false. Replace in 𝒜 each letter having the value T by A₁ ∨ ¬A₁,
and each letter having the value F by A₁ ∧ ¬A₁. Call the resulting
statement form 𝒞. Thus, 𝒞 is an axiom of L*, and, therefore, ⊢L* 𝒞. Observe
that 𝒞 always has the value F for any truth assignment. Hence, ¬𝒞 is a
tautology. So ⊢L ¬𝒞 and, therefore, ⊢L* ¬𝒞.

1.51 (Deborah Moll) Use two truth values. Let ⇒ have its usual table and
let ¬ be interpreted as the constant function F. When B is F, (¬B ⇒ ¬A)
⇒ ((¬B ⇒ A) ⇒ B) is F.

1.52 The theorems of P are the same as the axioms. Assume that P is suitable for 
some n-valued logic. Then, for all values k, k  k will be a designated value. 
Consider the sequence of formulas .4 = A,.4,; =A *.4. Since there are n” 
possible truth functions of one variable, among .%, ...,.4,nthere must be 
two different formulas .4 and . 4, that determine the same truth function. 
Hence, .4*.4 will be an exceptional formula that is not a theorem. 

1.53 Take as axioms all exceptional formulas, and the identity function as 
the only rule of inference. 


Chapter 2 

21 a. (vx.)( Ale) A (-A1c2)))) (b) (((¥22) Al(x2)) <= Ai(x2)) 
de ((cv9((vx5)((v24)A1(1)))) = (Alera) (-41G0)))) 

22. a ((vx1) (Alen) => Ai(x:))) V (4x1 )Ai (x1) 

2.3. a. The only free occurrence of a variable is that of x,. 
b. The first occurrence of x, is free, as is the last occurrence of x5. 

2.6 Yes, in parts (a), (c) and (e) 

2.8 a. (∀x)(P(x) ⇒ L(x))
b. (∀x)(P(x) ⇒ ¬H(x)) or ¬(∃x)(P(x) ∧ H(x))
c. ¬(∀x)(B(x) ⇒ F(x))
d. (∀x)(B(x) ⇒ ¬F(x)) (e) T(x) ⇒ I(x)
f. (∀x)(∀y)(S(x) ∧ D(x, y) ⇒ J(y))
j. (∀x)(¬H(x, x) ⇒ H(j, x)) or (∀x)(P(x) ∧ ¬H(x, x) ⇒ H(j, x))

(In the second wf, we have specified that John hates those persons
who do not hate themselves, where P(x) means x is a person.)
2.9 a. All bachelors are unhappy. (c) There is no greatest integer. 
2.10 a. i. Is satisfied by all pairs (x1, x.) of positive integers such that 


Xy°X,>2. 


ii. Is satisfied by all pairs (x, x.) of positive integers such that 
either x, <x, (when the antecedent is false) or x, = x, (when the 
antecedent and consequent are both true). 


iii. Is true.

Between any two real numbers there is a rational number.

I. A sequence s satisfies ¬ℬ if and only if s does not satisfy ℬ.
Hence, all sequences satisfy ¬ℬ if and only if no sequence satisfies
ℬ; that is, ¬ℬ is true if and only if ℬ is false.

II. There is at least one sequence s in Σ. If s satisfies ℬ, ℬ cannot
be false for M. If s does not satisfy ℬ, ℬ cannot be true for M.

III. If a sequence s satisfies both ℬ and ℬ ⇒ 𝒞, then s satisfies 𝒞 by
condition 3 of the definition.

V. a. s satisfies ℬ ∧ 𝒞 if and only if s satisfies ¬(ℬ ⇒ ¬𝒞)
if and only if s does not satisfy ℬ ⇒ ¬𝒞
if and only if s satisfies ℬ but not ¬𝒞
if and only if s satisfies ℬ and s satisfies 𝒞

VI. a. Assume ⊨M ℬ. Then every sequence satisfies ℬ. In particular,
every sequence that differs from a sequence s in at most the
ith place satisfies ℬ. So, every sequence satisfies (∀xᵢ)ℬ; that is,
⊨M (∀xᵢ)ℬ.

b. Assume ⊨M (∀xᵢ)ℬ. If s is a sequence, then any sequence that
differs from s in at most the ith place satisfies ℬ, and, in particular,
s satisfies ℬ. Then every sequence satisfies ℬ; that is, ⊨M ℬ.

Lemma. If all the variables in a term t occur in the list x_{i1}, ..., x_{ik}
(k ≥ 0; when k = 0, t has no variables), and if the sequences s
and s′ have the same components in the i1th, ..., ikth places, then
s*(t) = (s′)*(t).

Proof. Induction on the number m of function letters in t. Assume
the result holds for all integers less than m.

Case 1. t is an individual constant a_p. Then s*(a_p) = (a_p)^M = (s′)*(a_p).

Case 2. t is a variable x_{ij}. Then s*(x_{ij}) = s_{ij} = s′_{ij} = (s′)*(x_{ij}).

Case 3. t is of the form f_j^n(t₁, ..., tₙ). For q ≤ n, each t_q has
fewer than m function letters and all its variables occur
among x_{i1}, ..., x_{ik}. By inductive hypothesis s*(t_q) = (s′)*(t_q).
Then s*(f_j^n(t₁, ..., tₙ)) = (f_j^n)^M(s*(t₁), ..., s*(tₙ)) = (f_j^n)^M((s′)*(t₁),
..., (s′)*(tₙ)) = (s′)*(f_j^n(t₁, ..., tₙ)).


Proof of (VIII). Induction on the number r of connectives and
quantifiers in ℬ. Assume the result holds for all q < r.

Case 1. ℬ is of the form A_j^n(t₁, ..., tₙ); that is, r = 0. All the variables
of each tᵢ occur among x_{i1}, ..., x_{ik}. Hence, by the lemma, s*(tᵢ) =
(s′)*(tᵢ). But s satisfies A_j^n(t₁, ..., tₙ) if and only if (s*(t₁), ..., s*(tₙ)) is
in (A_j^n)^M, that is, if and only if ((s′)*(t₁), ..., (s′)*(tₙ)) is in (A_j^n)^M,
which is equivalent to s′ satisfying A_j^n(t₁, ..., tₙ).

Case 2. ℬ is of the form ¬𝒞.
Case 3. ℬ is of the form 𝒞 ⇒ 𝒟. Both cases 2 and 3 are easy.

Case 4. ℬ is of the form (∀xⱼ)𝒞. The free variables of 𝒞 occur among
x_{i1}, ..., x_{ik} and xⱼ. Assume s satisfies ℬ. Then every sequence that
differs from s in at most the jth place satisfies 𝒞. Let s# be any
sequence that differs from s′ in at most the jth place. Let s♭ be
a sequence that has the same components as s in all but the jth
place, where it has the same component as s#. Hence, s♭ satisfies
𝒞. Since s♭ and s# agree in the i1th, ..., ikth and jth places, it follows
by inductive hypothesis that s♭ satisfies 𝒞 if and only if s# satisfies
𝒞. Hence, s# satisfies 𝒞. Thus, s′ satisfies ℬ. By symmetry, the
converse also holds.

Assume ℬ is closed. By (VIII), for any sequences s and s′, s satisfies
ℬ if and only if s′ satisfies ℬ. If ¬ℬ is not true for M, some
sequence s′ does not satisfy ¬ℬ; that is, s′ satisfies ℬ. Hence,
every sequence s satisfies ℬ; that is, ⊨M ℬ.


. Proof of Lemma 1: induction on the number m of function letters 


int. 
Case 1. t is Hj. Then t’ is i. Hence, 


s*(t')=s*(aj)=(a;)" =(s')*(aj) =(s')*(f) 


Case 2. t is x, where j # i. Then t’ is x;. By the lemma of (VIID), 
s*(t’) = (s’)*(@, since s and s’ have the same component in the jth 
place. 

Case 3. t is x, Then t’ is u. Hence, s*(t’) = s*(u), while (s’)*(f) = (s’)*(x) = 
s; = s*(u). 

Case 4. t is of the form f;’(t,...,t:). For 1 < q <n, let ) result 
from t, by the substitution of u for x;. By inductive hypothesis, 
s*(#,) =(s')* (t,). But 


s*(t)=s*(ff'(H, fe) =(FF) (s#(H), - 5*(E4)) 
= (41) (W)C), 18) (b) =)#( FF ss td) =O) 


Proof of Lemma 2(a): induction on the number m of connectives and 
quantifiers in A(x). 

Case 1. m = 0. Then .Ax;) is Aj'(h, ..., tn). Let tj be the result of sub- 
stituting ¢ for all occurrences of x; in t,. Thus, A(#) is Aj (1, ..., th). 
By Lemma 1, s*(t,) =(s')*(t,). Now, s satisfies .4 (t) if and only if 
(s* (ff), 0.2, s*(t,))belongsto( A") , whichis equivalent to ((s’)*(t),..., 


(s’)*(t,,)) belonging to (A i "that is, to s’ satisfying .7 (x). 




Case 2. $\mathscr{B}(x_i)$ is $\neg\mathscr{C}(x_i)$; this is straightforward.

Case 3. $\mathscr{B}(x_i)$ is $\mathscr{C}(x_i) \Rightarrow \mathscr{D}(x_i)$; this is straightforward.

Case 4. $\mathscr{B}(x_i)$ is $(\forall x_j)\mathscr{C}(x_i)$.

Case 4a. $x_j$ is $x_i$. Then $x_i$ is not free in $\mathscr{B}(x_i)$, and $\mathscr{B}(t)$ is $\mathscr{B}(x_i)$. Since
$x_i$ is not free in $\mathscr{B}(x_i)$, it follows by (VIII) that $s$ satisfies $\mathscr{B}(t)$ if and
only if $s'$ satisfies $\mathscr{B}(x_i)$.

Case 4b. $x_j$ is different from $x_i$. Since $t$ is free for $x_i$ in $\mathscr{B}(x_i)$, $t$ is also
free for $x_i$ in $\mathscr{C}(x_i)$.

Assume $s$ satisfies $(\forall x_j)\mathscr{C}(t)$. We must show that $s'$ satisfies
$(\forall x_j)\mathscr{C}(x_i)$. Let $s^{\#}$ differ from $s'$ in at most the $j$th place. It suffices to show
that $s^{\#}$ satisfies $\mathscr{C}(x_i)$. Let $s^{\flat}$ be the same as $s^{\#}$ except that it has the
same $i$th component as $s$. Hence, $s^{\flat}$ is the same as $s$ except in its $j$th
component. Since $s$ satisfies $(\forall x_j)\mathscr{C}(t)$, $s^{\flat}$ satisfies $\mathscr{C}(t)$. Now, since $t$ is
free for $x_i$ in $(\forall x_j)\mathscr{C}(x_i)$, $t$ does not contain $x_j$. (The other possibility,
that $x_i$ is not free in $\mathscr{C}(x_i)$, is handled as in case 4a.) Hence, by the
lemma of (VIII), $(s^{\flat})^*(t) = s^*(t)$. Hence, by the inductive hypothesis
and the fact that $s^{\#}$ is obtained from $s^{\flat}$ by substituting $(s^{\flat})^*(t)$ for the
$i$th component of $s^{\flat}$, it follows that $s^{\#}$ satisfies $\mathscr{C}(x_i)$ if and only if $s^{\flat}$
satisfies $\mathscr{C}(t)$. Since $s^{\flat}$ satisfies $\mathscr{C}(t)$, $s^{\#}$ satisfies $\mathscr{C}(x_i)$.

Conversely, assume $s'$ satisfies $(\forall x_j)\mathscr{C}(x_i)$. Let $s^{\flat}$ differ from $s$ in
at most the $j$th place. Let $s^{\#}$ be the same as $s'$ except in the $j$th
place, where it is the same as $s^{\flat}$. Then $s^{\#}$ satisfies $\mathscr{C}(x_i)$. As above,
$s^*(t) = (s^{\flat})^*(t)$. Hence, by the inductive hypothesis, $s^{\flat}$ satisfies $\mathscr{C}(t)$ if
and only if $s^{\#}$ satisfies $\mathscr{C}(x_i)$. Since $s^{\#}$ satisfies $\mathscr{C}(x_i)$, $s^{\flat}$ satisfies $\mathscr{C}(t)$.
Therefore, $s$ satisfies $(\forall x_j)\mathscr{C}(t)$.

Proof of Lemma 2(b). Assume $s$ satisfies $(\forall x_i)\mathscr{B}(x_i)$. We must show
that $s$ satisfies $\mathscr{B}(t)$. Let $s'$ arise from $s$ by substituting $s^*(t)$ for the
$i$th component of $s$. Since $s$ satisfies $(\forall x_i)\mathscr{B}(x_i)$ and $s'$ differs from $s$
in at most the $i$th place, $s'$ satisfies $\mathscr{B}(x_i)$. By Lemma 2(a), $s$ satisfies
$\mathscr{B}(t)$.


Assume $\mathscr{B}$ is satisfied by a sequence $s$. Let $s'$ be any sequence. By (VIII),
$s'$ also satisfies $\mathscr{B}$. Hence, $\mathscr{B}$ is satisfied by all sequences; that is, $\vDash_M \mathscr{B}$.


a. 
a. 


x is a common divisor of y and z. (d) $x_1$ is a bachelor.
i. Every nonnegative integer is even or odd. True.

ii. If the product of two nonnegative integers is zero, at least one
of them is zero. True.

iii. 1 is even. False.

(a) Consider an interpretation with the set of integers as its domain.
Let $A_1^1(x)$ mean that x is even and let $A_2^1(x)$ mean that x is odd. Then
$(\forall x_1)A_1^1(x_1)$ is false, and so $(\forall x_1)A_1^1(x_1) \Rightarrow (\forall x_1)A_2^1(x_1)$ is true. However,
$(\forall x_1)(A_1^1(x_1) \Rightarrow A_2^1(x_1))$ is false, since it asserts that all even integers
are odd.
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2.18 a. 
b. 
2.19 b. 
2.21 a. 
2.22 a. 
2.26 a. 
2.27 a. 
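$[(\forall x_i)\neg\mathscr{B}(x_i) \Rightarrow \neg\mathscr{B}(t)] \Rightarrow [\mathscr{B}(t) \Rightarrow \neg(\forall x_i)\neg\mathscr{B}(x_i)]$ is logically valid
because it is an instance of the tautology $(A \Rightarrow \neg B) \Rightarrow (B \Rightarrow \neg A)$. By
(X), $(\forall x_i)\neg\mathscr{B}(x_i) \Rightarrow \neg\mathscr{B}(t)$ is logically valid. Hence, by (III), $\mathscr{B}(t) \Rightarrow
\neg(\forall x_i)\neg\mathscr{B}(x_i)$ is logically valid.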

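Intuitive proof: If $\mathscr{B}$ is true for all $x_i$, then $\mathscr{B}$ is true for some $x_i$.
Rigorous proof: Assume $(\forall x_i)\mathscr{B} \Rightarrow (\exists x_i)\mathscr{B}$ is not logically valid. Then
there is an interpretation M for which it is not true. Hence, there is
a sequence $s$ in $\Sigma$ such that $s$ satisfies $(\forall x_i)\mathscr{B}$ and $s$ does not satisfy
$\neg(\forall x_i)\neg\mathscr{B}$. From the latter, $s$ satisfies $(\forall x_i)\neg\mathscr{B}$. Since $s$ satisfies $(\forall x_i)\mathscr{B}$,
$s$ satisfies $\mathscr{B}$, and, since $s$ satisfies $(\forall x_i)\neg\mathscr{B}$, $s$ satisfies $\neg\mathscr{B}$. But then $s$
satisfies both $\mathscr{B}$ and $\neg\mathscr{B}$, which is impossible.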


Take the domain to be the set of integers and let $A_1^1(u)$ mean that
u is even. A sequence $s$ in which $s_1$ is even satisfies $A_1^1(x_1)$ but does
not satisfy $(\forall x_1)A_1^1(x_1)$.

Let the domain be the set of integers and let $A_1^2(x, y)$ mean that x <
y. (b) Same interpretation as in (a).

The premisses are (i) $(\forall x)(S(x) \Rightarrow N(x))$ and (ii) $(\forall x)(V(x) \Rightarrow \neg N(x))$,
and the conclusion is $(\forall x)(V(x) \Rightarrow \neg S(x))$. Intuitive proof: Assume
V(x). By (ii), $\neg N(x)$. By (i), $\neg S(x)$. Thus, $\neg S(x)$ follows from V(x), and
the conclusion holds. A more rigorous proof can be given along
the lines of (I)-(XI), but a better proof will become available after
the study of predicate calculi.
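
As a sanity check on the intuitive proof above, the entailment can be tested by brute force on small finite domains. This is only a sketch: the function name `entails_on_domain` is made up for illustration, and checking domains of size 1 to 3 is evidence, not a proof of logical validity.

```python
from itertools import product

def entails_on_domain(size):
    """Try every interpretation of S, N, V on a domain of the given size and
    report whether premisses (i) and (ii) always force the conclusion."""
    domain = range(size)
    for bits in product([False, True], repeat=3 * size):
        S = bits[0:size]
        N = bits[size:2 * size]
        V = bits[2 * size:3 * size]
        premiss_i = all((not S[x]) or N[x] for x in domain)          # (i)
        premiss_ii = all((not V[x]) or (not N[x]) for x in domain)   # (ii)
        conclusion = all((not V[x]) or (not S[x]) for x in domain)
        if premiss_i and premiss_ii and not conclusion:
            return False  # a counterexample interpretation was found
    return True
```

No counterexample exists on these small domains, in agreement with the argument in the text.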


(3x)(2y)( Ai) 0 4i(y)) 

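1. $(\forall x)(\mathscr{B} \Rightarrow \mathscr{C})$   Hyp
2. $(\forall x)\mathscr{B}$   Hyp
3. $(\forall x)(\mathscr{B} \Rightarrow \mathscr{C}) \Rightarrow (\mathscr{B} \Rightarrow \mathscr{C})$   Axiom (A4)
4. $\mathscr{B} \Rightarrow \mathscr{C}$   1, 3, MP
5. $(\forall x)\mathscr{B} \Rightarrow \mathscr{B}$   Axiom (A4)
6. $\mathscr{B}$   2, 5, MP
7. $\mathscr{C}$   4, 6, MP
8. $(\forall x)\mathscr{C}$   7, Gen
9. $(\forall x)(\mathscr{B} \Rightarrow \mathscr{C}), (\forall x)\mathscr{B} \vdash (\forall x)\mathscr{C}$   1-8
10. $(\forall x)(\mathscr{B} \Rightarrow \mathscr{C}) \vdash (\forall x)\mathscr{B} \Rightarrow (\forall x)\mathscr{C}$   1-9, Corollary 2.6
11. $\vdash (\forall x)(\mathscr{B} \Rightarrow \mathscr{C}) \Rightarrow ((\forall x)\mathscr{B} \Rightarrow (\forall x)\mathscr{C})$   1-10, Corollary 2.6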


2.28 Hint: Assume $\vdash_K \mathscr{B}$. By induction on the number of steps in the proof of
$\mathscr{B}$ in K, prove that, for any variables $y_1, \ldots, y_n$ ($n \ge 0$), $\vdash_{K^{\#}} (\forall y_1) \ldots (\forall y_n)\mathscr{B}$.


2.31 a. 


1. $(\forall x)(\forall y)A_1^2(x, y)$   Hyp
2. $(\forall y)A_1^2(x, y)$   1, Rule A4
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2.33 


2.36 
2.37 


2.39 


2.46 


2.49 


2.50 
2.55 


2.59 


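3. $A_1^2(x, x)$   2, Rule A4
4. $(\forall x)A_1^2(x, x)$   3, Gen
5. $(\forall x)(\forall y)A_1^2(x, y) \vdash (\forall x)A_1^2(x, x)$   1-4
6. $\vdash (\forall x)(\forall y)A_1^2(x, y) \Rightarrow (\forall x)A_1^2(x, x)$   1-5, Corollary 2.6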
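a. $\vdash \neg(\forall x)\neg\neg\mathscr{B} \Leftrightarrow \neg(\forall x)\mathscr{B}$ by the replacement theorem and the fact
that $\vdash \neg\neg\mathscr{B} \Leftrightarrow \mathscr{B}$. Replace $\neg(\forall x)\neg\neg\mathscr{B}$ by its abbreviation $(\exists x)\neg\mathscr{B}$.
b. $(\exists\varepsilon)(\varepsilon > 0 \wedge (\forall\delta)(\delta > 0 \Rightarrow (\exists x)(|x - c| < \delta \wedge \neg|f(x) - f(c)| < \varepsilon)))$
i. Assume $\vdash \mathscr{B}$. By moving the negation step-by-step inward to
the atomic wfs, show that $\vdash \neg\mathscr{B}^* \Leftrightarrow \mathscr{C}$, where $\mathscr{C}$ is obtained
from $\mathscr{B}$ by replacing all atomic wfs by their negations. But,
from $\vdash \mathscr{B}$ it can be shown that $\vdash \mathscr{C}$. Hence, $\vdash \neg\mathscr{B}^*$. The converse
follows by noting that $(\mathscr{B}^*)^*$ is $\mathscr{B}$.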


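ii. Apply (i) to $\neg\mathscr{B} \vee \mathscr{C}$.

1. $(\exists y)(\forall x)(A_1^2(x, y) \Leftrightarrow \neg A_1^2(x, x))$   Hyp
2. $(\forall x)(A_1^2(x, b) \Leftrightarrow \neg A_1^2(x, x))$   1, Rule C
3. $A_1^2(b, b) \Leftrightarrow \neg A_1^2(b, b)$   2, Rule A4
4. $\mathscr{C} \wedge \neg\mathscr{C}$   3, Tautology

($\mathscr{C}$ is any wf not containing b.) Use Proposition 2.10 and proof by
contradiction.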


a. In step 4, b is not a new individual constant. It was already used in
step 2.


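Assume K is complete and let $\mathscr{B}$ and $\mathscr{C}$ be closed wfs of K such that
$\vdash_K \mathscr{B} \vee \mathscr{C}$. Assume not-$\vdash_K \mathscr{B}$. Then, by completeness, $\vdash_K \neg\mathscr{B}$. Hence,
by the tautology $\neg A \Rightarrow ((A \vee B) \Rightarrow B)$, $\vdash_K \mathscr{C}$. Conversely, assume K is
not complete. Then there is a sentence $\mathscr{B}$ of K such that not-$\vdash_K \mathscr{B}$ and
not-$\vdash_K \neg\mathscr{B}$. However, $\vdash_K \mathscr{B} \vee \neg\mathscr{B}$.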


See Tarski, Mostowski and Robinson (1953, pp. 15-16). 


b. It suffices to assume $\mathscr{B}$ is a closed wf. (Otherwise, look at the
closure of $\mathscr{B}$.) We can effectively write all the interpretations on
a finite domain $\{b_1, \ldots, b_k\}$. (We need only specify the interpretations
of the symbols that occur in $\mathscr{B}$.) For every such interpretation,
replace every wf $(\forall x)\mathscr{C}(x)$, where $\mathscr{C}(x)$ has no quantifiers,
by $\mathscr{C}(b_1) \wedge \ldots \wedge \mathscr{C}(b_k)$, and continue until no quantifiers are left.
One can then evaluate the truth of the resulting wf for the given
interpretation.
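
The procedure just described can be sketched in Python. The tuple encoding of wfs and the names `holds`, `all_interpretations`, and `true_in_all` are assumptions made for this illustration; the sketch handles only closed wfs built from predicate letters, with no constants or function letters, and it evaluates a universal quantifier as the conjunction over the finite domain.

```python
from itertools import product

def holds(wf, domain, preds, env=None):
    """Evaluate a wf in one interpretation; (Vx)C(x) becomes C(b1) and ... and C(bk)."""
    env = {} if env is None else env
    op = wf[0]
    if op == "atom":
        _, name, args = wf
        return tuple(env[a] for a in args) in preds[name]
    if op == "not":
        return not holds(wf[1], domain, preds, env)
    if op == "and":
        return holds(wf[1], domain, preds, env) and holds(wf[2], domain, preds, env)
    if op == "or":
        return holds(wf[1], domain, preds, env) or holds(wf[2], domain, preds, env)
    if op == "implies":
        return (not holds(wf[1], domain, preds, env)) or holds(wf[2], domain, preds, env)
    if op == "forall":
        _, var, body = wf
        return all(holds(body, domain, preds, {**env, var: b}) for b in domain)
    raise ValueError(f"unknown connective: {op}")

def all_interpretations(domain, arities):
    """Enumerate every assignment of relations to the given predicate letters."""
    names = sorted(arities)
    tuple_sets = [list(product(domain, repeat=arities[n])) for n in names]
    choices = [list(product([False, True], repeat=len(ts))) for ts in tuple_sets]
    for combo in product(*choices):
        yield {n: {t for t, keep in zip(ts, bits) if keep}
               for n, ts, bits in zip(names, tuple_sets, combo)}

def true_in_all(wf, domain, arities):
    """True when the closed wf holds in every interpretation on this domain."""
    return all(holds(wf, domain, p) for p in all_interpretations(domain, arities))
```

The number of interpretations grows very quickly with the domain size and the arities, so this is practical only for tiny cases.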


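Assume $K_1$ is not finitely axiomatizable. Let the axioms of $K_1$ be $\mathscr{B}_1$,
$\mathscr{B}_2, \ldots$, and let the axioms of $K_2$ be $\mathscr{C}_1, \mathscr{C}_2, \ldots$. Then $\{\mathscr{B}_1, \mathscr{C}_1, \mathscr{B}_2, \mathscr{C}_2, \ldots\}$ is
consistent. (If not, some finite subset $\{\mathscr{B}_1, \mathscr{B}_2, \ldots, \mathscr{B}_k, \mathscr{C}_1, \ldots, \mathscr{C}_m\}$ is inconsistent.
Since $K_1$ is not finitely axiomatizable, there is a theorem $\mathscr{B}$ of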
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2.60 


2.61 


2.65 


2.68 


2.70 


2.71 


2.74 
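$K_1$ such that $\mathscr{B}_1, \mathscr{B}_2, \ldots, \mathscr{B}_k \vdash \mathscr{B}$ does not hold. Hence, the theory with
axioms $\{\mathscr{B}_1, \mathscr{B}_2, \ldots, \mathscr{B}_k, \neg\mathscr{B}\}$ has a model M. Since $\vdash_{K_1} \mathscr{B}$, M must be a
model of $K_2$, and, therefore, M is a model of $\{\mathscr{B}_1, \mathscr{B}_2, \ldots, \mathscr{B}_k, \mathscr{C}_1, \ldots, \mathscr{C}_m\}$,
contradicting the inconsistency of this set of wfs.) Since $\{\mathscr{B}_1, \mathscr{C}_1, \mathscr{B}_2, \mathscr{C}_2, \ldots\}$
is consistent, it has a model, which must be a model of both $K_1$ and $K_2$.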


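Hint: Let the closures of the axioms of K be $\mathscr{B}_1, \mathscr{B}_2, \ldots$. Choose a
subsequence $\mathscr{B}_{j_1}, \mathscr{B}_{j_2}, \ldots$ such that $\mathscr{B}_{j_{n+1}}$ is the first sentence (if any) after
$\mathscr{B}_{j_n}$ that is not deducible from $\mathscr{B}_{j_1} \wedge \ldots \wedge \mathscr{B}_{j_n}$. Let $\mathscr{C}_k$ be $\mathscr{B}_{j_1} \wedge \mathscr{B}_{j_2} \wedge \ldots \wedge \mathscr{B}_{j_k}$.
Then the $\mathscr{C}_k$ form an axiom set for the theorems of K such that $\vdash \mathscr{C}_{k+1} \Rightarrow
\mathscr{C}_k$ but not-$\vdash \mathscr{C}_k \Rightarrow \mathscr{C}_{k+1}$. Then $\{\mathscr{C}_1, \mathscr{C}_1 \Rightarrow \mathscr{C}_2, \mathscr{C}_2 \Rightarrow \mathscr{C}_3, \ldots\}$ is an independent
axiomatization of K.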

Assume $\mathscr{B}$ is not logically valid. Then the closure $\mathscr{C}$ of $\mathscr{B}$ is not logically
valid. Hence, the theory K with $\neg\mathscr{C}$ as its only proper axiom has
a model. By the Skolem-Löwenheim theorem, K has a denumerable
model and, by the lemma in the proof of Corollary 2.22, K has a model
of cardinality m. Hence, $\mathscr{C}$ is false in this model and, therefore, $\mathscr{B}$ is not
true in some model of cardinality m.

c. 1. $x = x$   Proposition 2.23(a)
2. $(\exists y)x = y$   1, rule E4
3. $(\forall x)(\exists y)x = y$   2, Gen

a. The problem obviously reduces to the case of substitution for a single
variable at a time: $\vdash x_1 = y_1 \Rightarrow t(x_1) = t(y_1)$. From (A7), $\vdash x_1 = y_1 \Rightarrow
(t(x_1) = t(x_1) \Rightarrow t(x_1) = t(y_1))$. By Proposition 2.23(a), $\vdash t(x_1) = t(x_1)$. Hence,
$\vdash x_1 = y_1 \Rightarrow t(x_1) = t(y_1)$.

a. By Exercise 2.65(c), $\vdash (\exists y)x = y$. By Proposition 2.23(b, c), $\vdash (\forall y)
(\forall z)(x = y \wedge x = z \Rightarrow y = z)$. Hence, $\vdash (\exists_1 y)x = y$. By Gen, $\vdash (\forall x)(\exists_1 y)
x = y$.

b. i. Let $\bigwedge_{1 \le i < j \le n} x_i \ne x_j$ stand for the conjunction of all wfs of the
form $x_i \ne x_j$, where $1 \le i < j \le n$. Let $\mathscr{B}_n$ be $(\exists x_1) \ldots (\exists x_n) \bigwedge_{1 \le i < j \le n} x_i \ne x_j$.

ii. Assume there is a theory with axioms $\mathscr{A}_1, \ldots, \mathscr{A}_n$ that has the
same theorems as K. Each of $\mathscr{A}_1, \ldots, \mathscr{A}_n$ is provable from $K_1$
plus a finite number of the wfs $\mathscr{B}_1, \mathscr{B}_2, \ldots$. Hence, $K_1$ plus a
finite number of wfs $\mathscr{B}_{j_1}, \ldots, \mathscr{B}_{j_n}$ suffices to prove all theorems
of K. We may assume $j_1 < \cdots < j_n$. Then an interpretation whose
domain consists of $j_n$ objects would be a model of K, contradicting
the fact that $\mathscr{B}_{j_n + 1}$ is an axiom of K.


For the independence of axioms (A1)-(A3), replace all t = s by the
statement form $A \Rightarrow A$; then erase all quantifiers, terms and associated
commas and parentheses; axioms (A4)-(A6) go over into statement
forms of the form $P \Rightarrow P$, and axiom (A7) into $(P \Rightarrow P) \Rightarrow (Q \Rightarrow Q)$.
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Co-SO co ts 


OS -O-:@: So: Ds 


2.83 


2.84 
2.87 


2.88 


WNr OD 


WOnNrFroWwW 


For the independence of axiom (A1), the following four-valued logic, 
due to Dr D.K. Roy, works, where 0 is the only designated value. 


[Four-valued truth tables for $\neg A$ and $A \Rightarrow B$; the entries are not recoverable from this copy.]


When A and B take the values 3 and 0, respectively, axiom (A1) takes 
the value 1. For the independence of axiom (A2), Dr Roy devised the 
following four-valued logic, where 0 is the only designated value. 


[Four-valued truth tables for $\neg A$ and $A \Rightarrow B$; the entries are not recoverable from this copy.]


If A, B, and C take the values 3, 0, and 2, respectively, then axiom (A2)
is 1. For the independence of axiom (A3), the proof on page 36 works.
For axiom (A4), replace all universal quantifiers by existential quantifiers.
For axiom (A5), change all terms t to $x_1$ and replace all universal
quantifiers by $(\forall x_1)$. For axiom (A6), replace all wfs t = s by the negation
of some fixed theorem. For axiom (A7), consider an interpretation in
which the interpretation of = is a reflexive nonsymmetric relation.


a. (w)(9) (2) A(Z,X,Y, or Y)AAL (x,y,z))=> (3z) (A(z, ¥, %, «+ %) 
A 


Z=X 
a. (3z)(Vw)(Ax)([ Aix) => AP, y) | >[AI@) = A7(y,2)]) 


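$\mathscr{C}$ has the form $(\exists x)(\exists y)(\forall z)([A_1^2(x, y) \Rightarrow A_1^1(x)] \Rightarrow A_1^1(z))$. Let the
domain D be {1, 2}, let $A_1^2$ be <, and let $A_1^1(u)$ stand for u = 2. Then $\mathscr{C}$ is
true, but $(\forall x)(\exists y)A_1^2(x, y)$ is false.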
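
The two truth values claimed for this interpretation can be confirmed by direct evaluation over the two-element domain. A minimal sketch, with helper names (`less`, `is_two`, `implies`) chosen only for this illustration:

```python
D = [1, 2]

def less(x, y):      # interpretation of the binary predicate letter: x < y
    return x < y

def is_two(u):       # interpretation of the unary predicate letter: u = 2
    return u == 2

def implies(p, q):
    return (not p) or q

# (Ex)(Ey)(Vz)([A(x, y) => B(x)] => B(z))
wf_value = any(
    all(implies(implies(less(x, y), is_two(x)), is_two(z)) for z in D)
    for x in D for y in D
)

# (Vx)(Ey)A(x, y)
every_x_has_larger = all(any(less(x, y) for y in D) for x in D)
```

The witness for the first wf is x = 1, y = 2, which makes the inner antecedent false for every z.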


Let g be a one-one correspondence between D* and D. Define: 


(a) =o((a))") (Ar (ra) =o" | (HY (al), ~~ (00) 


Eur Al[ bi, ..., by Jif and only if FyA?| 9(bi),..., 9(bn) | 
Hint: Extend K by adding axioms $\mathscr{B}_n$, where $\mathscr{B}_n$ asserts that there are at
least n elements. The new theory has no finite models.


(a) Hint: Consider the wfs $\mathscr{B}_n$, where $\mathscr{B}_n$ asserts that there are at least
n elements. Use elimination of quantifiers, treating the $\mathscr{B}_n$s as if they
were atomic wfs.


Let W be any set. For each b in W, let $a_b$ be an individual constant. Let
the theory K have as its proper axioms: $a_b \ne a_c$ for all b, c in W such that
$b \ne c$, plus the axioms for a total order. K is consistent, since any finite
subset of its axioms has a model. (Any such finite subset contains only
a finite number of individual constants. One can define a total order
on any finite set B by using the one-one correspondence between B
and a set {1, 2, 3, ..., n} and carrying over to B the total order < on
{1, 2, 3, ..., n}.) Since K is consistent, K has a model M by the generalized
completeness theorem. The domain D of M is totally ordered by
the relation $<^M$; hence, the subset $D_W$ of D consisting of the objects $(a_b)^M$
is totally ordered by $<^M$. This total ordering of $D_W$ can then be carried
over to a total ordering of W: $b <_W c$ if and only if $(a_b)^M <^M (a_c)^M$.


Assume $M_1$ is finite and $M_1 \equiv M_2$. Let the domain $D_1$ of $M_1$ have n elements.
Then, since the assertion that a model has exactly n elements
can be written as a sentence, the domain $D_2$ of $M_2$ must also have n
elements. Let $D_1 = \{b_1, \ldots, b_n\}$ and $D_2 = \{c_1, \ldots, c_n\}$.

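Assume $M_1$ and $M_2$ are not isomorphic. Let $\varphi$ be any one of the
n! one-one correspondences between $D_1$ and $D_2$. Since $\varphi$ is not an
isomorphism, either: (1) there is an individual constant a and an element
$b_j$ of $D_1$ such that either (i) $b_j = a^{M_1} \wedge \varphi(b_j) \ne a^{M_2}$ or (ii) $b_j \ne a^{M_1} \wedge
\varphi(b_j) = a^{M_2}$; or (2) there is a function letter $f_k^m$ and $b_\ell, b_{j_1}, \ldots, b_{j_m}$ in $D_1$
such that

$b_\ell = (f_k^m)^{M_1}(b_{j_1}, \ldots, b_{j_m})$ and $\varphi(b_\ell) \ne (f_k^m)^{M_2}(\varphi(b_{j_1}), \ldots, \varphi(b_{j_m}))$

or (3) there is a predicate letter $A_k^m$ and $b_{j_1}, \ldots, b_{j_m}$ in $D_1$ such that either

i. $\vDash_{M_1} A_k^m[b_{j_1}, \ldots, b_{j_m}]$ and $\vDash_{M_2} \neg A_k^m[\varphi(b_{j_1}), \ldots, \varphi(b_{j_m})]$ or

ii. $\vDash_{M_1} \neg A_k^m[b_{j_1}, \ldots, b_{j_m}]$ and $\vDash_{M_2} A_k^m[\varphi(b_{j_1}), \ldots, \varphi(b_{j_m})]$.

Construct a wf $\mathscr{B}_\varphi$ as follows. $\mathscr{B}_\varphi$ is:
$x_j = a$ if (1)(i) holds
$x_j \ne a$ if (1)(ii) holds
$x_\ell = f_k^m(x_{j_1}, \ldots, x_{j_m})$ if (2) holds
$A_k^m(x_{j_1}, \ldots, x_{j_m})$ if (3)(i) holds
$\neg A_k^m(x_{j_1}, \ldots, x_{j_m})$ if (3)(ii) holds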
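Let $\varphi_1, \ldots, \varphi_{n!}$ be the one-one correspondences between $D_1$ and $D_2$. Let
$\mathscr{A}$ be the wf

$(\exists x_1) \ldots (\exists x_n)\left[\bigwedge_{1 \le i < j \le n} x_i \ne x_j \wedge \mathscr{B}_{\varphi_1} \wedge \mathscr{B}_{\varphi_2} \wedge \ldots \wedge \mathscr{B}_{\varphi_{n!}}\right]$

Then $\mathscr{A}$ is true for $M_1$ but not for $M_2$.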

a. There are $\aleph_\alpha$ sentences in the language of K. Hence, there are
$2^{\aleph_\alpha}$ sets of sentences. If $M_1 \equiv M_2$ does not hold, then the set of
sentences true for $M_1$ is different from the set of sentences true
for $M_2$.

Let K* be the theory with Ny new symbols b, and, as axioms, all sen- 

tences true for M and all b, 4b, for t # p. Prove K* consistent and apply 

Corollary 2.34. 

a. Let M be the field of rational numbers and let X = {-1}. 

Consider the wf (4x,)x_ < x}. 

a. ii. Introduce a new individual constant b and form a new theory 

by adding to the complete diagram of M, all the sentences 
b #t for all closed terms t of the language of K. 

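If $\emptyset \notin \mathscr{F}$, $\mathscr{F} \ne \mathscr{P}(A)$. Conversely, if $\emptyset \in \mathscr{F}$ then, by clause (3) of the
definition of filter, $\mathscr{F} = \mathscr{P}(A)$.

If $\mathscr{F} = \mathscr{F}_B$, then $\bigcap_{C \in \mathscr{F}} C = B \in \mathscr{F}$. Conversely, if $B = \bigcap_{C \in \mathscr{F}} C \in \mathscr{F}$, then
$\mathscr{F} = \mathscr{F}_B$.

Use Exercise 2.113.

a. $A \in \mathscr{F}$ since $A = A - \emptyset$.

b. If $B = A - W_1 \in \mathscr{F}$ and $C = A - W_2 \in \mathscr{F}$, where $W_1$ and $W_2$ are finite,
then $B \cap C = A - (W_1 \cup W_2) \in \mathscr{F}$, since $W_1 \cup W_2$ is finite.

c. If $B = A - W \in \mathscr{F}$, where W is finite, and if $B \subseteq C$, then $C = A -
(W - C) \in \mathscr{F}$, since $W - C$ is finite.

d. Let $B \in \mathscr{F}$. So, $B = A - W$, where W is finite. Let $b \in B$. Then $W \cup
\{b\}$ is finite. Hence, $C = A - (W \cup \{b\}) \in \mathscr{F}$. But $B \not\subseteq C$, since $b \notin C$.
Therefore, $\mathscr{F} \ne \mathscr{F}_B$.

Let $\mathscr{F}' = \{D \mid D \subseteq A \wedge (\exists C)(C \in \mathscr{F} \wedge B \cap C \subseteq D)\}$.

Assume that, for every $B \subseteq A$, either $B \in \mathscr{F}$ or $A - B \in \mathscr{F}$. Let $\mathscr{G}$ be a
filter such that $\mathscr{F} \subset \mathscr{G}$. Let $B \in \mathscr{G} - \mathscr{F}$. Then $A - B \in \mathscr{F}$. Hence, $A - B \in \mathscr{G}$.
So, $\emptyset = B \cap (A - B) \in \mathscr{G}$ and $\mathscr{G}$ is improper. The converse follows from
Exercise 2.118.

Assume $\mathscr{F}$ is an ultrafilter and $B \notin \mathscr{F}$, $C \notin \mathscr{F}$. By Exercise 2.119, $A - B
\in \mathscr{F}$ and $A - C \in \mathscr{F}$. Hence, $A - (B \cup C) = (A - B) \cap (A - C) \in \mathscr{F}$. Since $\mathscr{F}$
is proper, $B \cup C \notin \mathscr{F}$. Conversely, assume $B \notin \mathscr{F} \wedge C \notin \mathscr{F} \Rightarrow B \cup C \notin \mathscr{F}$.
Since $B \cup (A - B) = A \in \mathscr{F}$, this implies that, if $B \notin \mathscr{F}$, then $A - B \in \mathscr{F}$. Use
Exercise 2.119.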

a. Assume $\mathscr{F}_C$ is a principal ultrafilter. Let $a \in C$ and assume $C \ne \{a\}$.
Then $\{a\} \notin \mathscr{F}_C$ and $C - \{a\} \notin \mathscr{F}_C$. By Exercise 2.120, $C = \{a\} \cup (C - \{a\})
\notin \mathscr{F}_C$, which yields a contradiction.

b. Assume a nonprincipal ultrafilter $\mathscr{F}$ contains a finite set, and let B
be a finite set in $\mathscr{F}$ of least cardinality. Since $\mathscr{F}$ is nonprincipal, the
cardinality of B is greater than 1. Let $b \in B$. Then $B - \{b\} \ne \emptyset$. Both
$\{b\}$ and $B - \{b\}$ are finite sets of lower cardinality than B. Hence,
$\{b\} \notin \mathscr{F}$ and $B - \{b\} \notin \mathscr{F}$. By Exercise 2.120, $B = \{b\} \cup (B - \{b\}) \notin \mathscr{F}$,
which contradicts the definition of B.
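
One consequence of parts (a) and (b) is that on a finite set every ultrafilter is principal and is generated by a singleton. The following brute-force sketch checks this on a three-element set. The set `A` and all function names are assumptions chosen for the illustration, and the enumeration of all 256 families of subsets would not scale to larger sets.

```python
from itertools import combinations

A = frozenset({0, 1, 2})
subsets = [frozenset(c) for r in range(len(A) + 1)
           for c in combinations(sorted(A), r)]

def is_proper_filter(F):
    """A in F, empty set not in F, closed under intersection and supersets."""
    if A not in F or frozenset() in F:
        return False
    closed_meet = all((B & C) in F for B in F for C in F)
    closed_up = all(C in F for B in F for C in subsets if B <= C)
    return closed_meet and closed_up

def all_families():
    for mask in range(1 << len(subsets)):
        yield frozenset(s for i, s in enumerate(subsets) if (mask >> i) & 1)

filters = [F for F in all_families() if is_proper_filter(F)]
# An ultrafilter is a proper filter with no proper filter strictly above it.
ultrafilters = [F for F in filters if not any(F < G for G in filters)]
# Principal filters generated by the singletons {a}.
principal = [frozenset(s for s in subsets if a in s) for a in sorted(A)]
```

Each ultrafilter found this way also satisfies the criterion of Exercise 2.119: for every subset B, exactly one of B and A - B belongs to it.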

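Let J be the set of all finite subsets of $\Gamma$. For each $\Delta$ in J, choose a model
$M_\Delta$ of $\Delta$. For $\Delta$ in J, let $\Delta^* = \{\Delta' \mid \Delta' \in J \wedge \Delta \subseteq \Delta'\}$. The collection $\mathscr{G}$ of all
$\Delta^*$s has the finite-intersection property. By Exercise 2.117, there is a
proper filter $\mathscr{F} \supseteq \mathscr{G}$. By the ultrafilter theorem, there is an ultrafilter
$\mathscr{F}' \supseteq \mathscr{F} \supseteq \mathscr{G}$. Consider $\prod_{\Delta \in J} M_\Delta / \mathscr{F}'$. Let $\mathscr{B} \in \Gamma$. Then $\{\mathscr{B}\}^* \in \mathscr{G} \subseteq \mathscr{F}'$.
Therefore, $\{\mathscr{B}\}^* \subseteq \{\Delta \mid \Delta \in J \wedge \vDash_{M_\Delta} \mathscr{B}\} \in \mathscr{F}'$. By Łoś's theorem, $\mathscr{B}$ is true
in $\prod_{\Delta \in J} M_\Delta / \mathscr{F}'$.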

a. Assume 7 is closed under elementary equivalence and ultraprod- 
ucts. Let A be the set of all sentences of that are true in every 
interpretation in ~% Let M be any model of A. We must show that 
M isin % Let T be the set of all sentences true for M. Let J be the 
set of finite subsets of ND. For I’ = {.4, ...,.4} € J, choose an interpre- 
tation Np. in 7 such that .4 A... A .%, is true in Np. (If there were 
no such interpretation, =(.4 A ... A 4), though false in M, would 

be in A.) As in Exercise 2.124, there is an ultrafilter “ such that 

N*t= e /Nr /7' is a model of 2 Now, N* € ~% Moreover, M = 
N*. Hence, M € % 

b. Use (a) and Exercise 2.59. 

Let 7 be the class of all fields of characteristic 0. Let be a nonprin- 
cipal ultrafilter on the set P of primes, and consider M = [iz p/Z. 
Apply (b). peP 

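$R \subseteq R^*$. Hence, the cardinality of $R^*$ is $\ge 2^{\aleph_0}$. On the other hand, $R^\omega$
is equinumerous with $2^\omega$ and, therefore, has cardinality $2^{\aleph_0}$. But the
cardinality of $R^*$ is at most that of $R^\omega$.

Assume x and y are infinitesimals. Let $\varepsilon$ be any positive real. Then
$|x| < \varepsilon/2$ and $|y| < \varepsilon/2$. So, $|x + y| \le |x| + |y| < \varepsilon/2 + \varepsilon/2 = \varepsilon$; $|xy| =
|x||y| < 1 \cdot \varepsilon = \varepsilon$; $|x - y| \le |x| + |-y| < \varepsilon/2 + \varepsilon/2 = \varepsilon$.

Assume $|x| < r_1$ and $|y| < \varepsilon$ for all positive real $\varepsilon$. Let $\varepsilon$ be a positive
real. Then $\varepsilon/r_1$ is a positive real. Hence, $|y| < \varepsilon/r_1$, and so, $|xy| =
|x||y| < r_1(\varepsilon/r_1) = \varepsilon$.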
Assume $x - r_1$ and $x - r_2$ are infinitesimals, with $r_1$ and $r_2$ real. Then
$(x - r_2) - (x - r_1) = r_1 - r_2$ is infinitesimal and real. Hence, $r_1 - r_2 = 0$.

a. $x - \mathrm{st}(x)$ and $y - \mathrm{st}(y)$ are infinitesimals. Hence, their sum $(x + y) -
(\mathrm{st}(x) + \mathrm{st}(y))$ is an infinitesimal. Since $\mathrm{st}(x) + \mathrm{st}(y)$ is real, $\mathrm{st}(x) +
\mathrm{st}(y) = \mathrm{st}(x + y)$ by Exercise 2.130.

a. By Proposition 2.45, $s^*(n) \approx c_1$ and $u^*(n) \approx c_2$ for all $n \in \omega^* - \omega$.
Hence, $s^*(n) + u^*(n) \approx c_1 + c_2$ for all $n \in \omega^* - \omega$. But $s^*(n) + u^*(n) =
(s + u)^*(n)$. Apply Proposition 2.45.

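Assume f continuous at c. Take any positive real $\varepsilon$. Then there is a
positive real $\delta$ such that $(\forall x)(x \in B \wedge |x - c| < \delta \Rightarrow |f(x) - f(c)| < \varepsilon)$
holds in $\mathscr{R}$. Therefore, $(\forall x)(x \in B^* \wedge |x - c| < \delta \Rightarrow |f^*(x) - f(c)| < \varepsilon)$ holds
in $\mathscr{R}^*$. So, if $x \in B^*$ and $x \approx c$, then $|x - c| < \delta$ and, therefore, $|f^*(x) -
f(c)| < \varepsilon$. Since $\varepsilon$ was arbitrary, $f^*(x) \approx f(c)$. Conversely, assume $x \in B^* \wedge
x \approx c \Rightarrow f^*(x) \approx f(c)$. Take any positive real $\varepsilon$. Let $\delta_0$ be a positive
infinitesimal. Then $(\forall x)(x \in B^* \wedge |x - c| < \delta_0 \Rightarrow |f^*(x) - f(c)| < \varepsilon)$ holds for
$\mathscr{R}^*$. Hence, $(\exists\delta)(\delta > 0 \wedge (\forall x)(x \in B^* \wedge |x - c| < \delta \Rightarrow |f^*(x) - f(c)| < \varepsilon))$ holds
for $\mathscr{R}^*$, and so, $(\exists\delta)(\delta > 0 \wedge (\forall x)(x \in B \wedge |x - c| < \delta \Rightarrow |f(x) - f(c)| < \varepsilon))$
holds in $\mathscr{R}$.

a. Since $x \in B^* \wedge x \approx c \Rightarrow (f^*(x) \approx f(c) \wedge g^*(x) \approx g(c))$ by Proposition 2.46,
we can conclude $x \in B^* \wedge x \approx c \Rightarrow (f + g)^*(x) \approx (f + g)(c)$, and so, by
Proposition 2.46, f + g is continuous at c.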


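a. i. $\neg[(\forall x)(A_1^1(x) \vee A_2^1(x)) \Rightarrow ((\forall x)A_1^1(x)) \vee (\forall x)A_2^1(x)]$
ii. $(\forall x)(A_1^1(x) \vee A_2^1(x))$   (i)
iii. $\neg[((\forall x)A_1^1(x)) \vee (\forall x)A_2^1(x)]$   (i)
iv. $\neg(\forall x)A_1^1(x)$   (iii)
v. $\neg(\forall x)A_2^1(x)$   (iii)
vi. $(\exists x)\neg A_1^1(x)$   (iv)
vii. $(\exists x)\neg A_2^1(x)$   (v)
viii. $\neg A_1^1(b)$   (vi)
ix. $\neg A_2^1(c)$   (vii)
x. $A_1^1(b) \vee A_2^1(b)$   (ii)
xi. left branch: $A_1^1(b)$ ×; right branch: $A_2^1(b)$   (x)
xii. on the right branch: $A_1^1(c) \vee A_2^1(c)$   (ii)
xiii. left branch: $A_1^1(c)$; right branch: $A_2^1(c)$ ×   (xii)

No further rules are applicable and there is an unclosed branch.
Let the model M have domain {b, c}, let $(A_1^1)^M$ hold only for c,
and let $(A_2^1)^M$ hold for only b. Then, $(\forall x)(A_1^1(x) \vee A_2^1(x))$ is true
for M, but $(\forall x)A_1^1(x)$ and $(\forall x)A_2^1(x)$ are both false for M. Hence,
$(\forall x)(A_1^1(x) \vee A_2^1(x)) \Rightarrow ((\forall x)A_1^1(x)) \vee (\forall x)A_2^1(x)$ is not logically
valid.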
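
The countermodel read off from the unclosed branch can be checked directly. In this small sketch the sets `A1` and `A2` stand for the interpretations of the two unary predicate letters; the variable names are chosen only for the illustration.

```python
domain = ["b", "c"]
A1 = {"c"}   # the first predicate letter holds only for c
A2 = {"b"}   # the second predicate letter holds only for b

# (Vx)(A1(x) or A2(x))
antecedent = all((x in A1) or (x in A2) for x in domain)
# ((Vx)A1(x)) or (Vx)A2(x)
consequent = all(x in A1 for x in domain) or all(x in A2 for x in domain)
# the implication whose validity is being refuted
implication = (not antecedent) or consequent
```

Here the antecedent is true and the consequent is false, so the implication fails in M.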


Chapter 3 


3.4 


3.5 


3.6 


Consider the interpretation that has as its domain the set of polynomials
with integral coefficients such that the leading coefficient is nonnegative.
The usual operations of addition and multiplication are the
interpretations of + and ·. Verify that (S1)-(S8) hold but that Proposition
3.11 is false (substituting the polynomial x for x and 2 for y).


a. Form a new theory S' by adding to S a new individual constant b
and the axioms $b \ne \overline{0}, b \ne \overline{1}, b \ne \overline{2}, \ldots, b \ne \overline{n}, \ldots$. Show that S' is consistent,
and apply Proposition 2.26 and Corollary 2.34(c).


b. By a cortège let us mean any denumerable sequence of 0s and 1s.
There are $2^{\aleph_0}$ cortèges. An element c of a denumerable model M of
S determines a cortège $(s_0, s_1, s_2, \ldots)$ as follows: $s_i = 0$ if $\vDash_M \overline{p_i}|c$, and
$s_i = 1$ if $\vDash_M \neg(\overline{p_i}|c)$. Consider now any cortège s. Add a new constant
b to S, together with the axioms $\mathscr{B}_i(b)$, where $\mathscr{B}_i(b)$ is $\overline{p_i}|b$ if $s_i = 0$ and
$\mathscr{B}_i(b)$ is $\neg(\overline{p_i}|b)$ if $s_i = 1$. This theory is consistent and, therefore, has
a denumerable model $M_s$, in which the interpretation of b determines
the cortège s. Thus, each of the $2^{\aleph_0}$ cortèges is determined
by an element of some denumerable model. Every denumerable
model determines denumerably many cortèges. Therefore,
if a maximal collection of mutually nonisomorphic denumerable
models had cardinality $m < 2^{\aleph_0}$, then the total number of cortèges
represented in all denumerable models would be $\le m \times \aleph_0 < 2^{\aleph_0}$.
(We use the fact that the elements of a denumerable model determine
the same cortèges as the elements of an isomorphic model.)
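
For an element of the standard model, an initial segment of its cortège is easy to compute. The sketch below assumes the primes are indexed from $p_0 = 2$; the function names are made up for illustration, and nothing here applies to nonstandard elements.

```python
def first_primes(count):
    """Return the first `count` primes, starting with 2, by trial division."""
    primes = []
    candidate = 2
    while len(primes) < count:
        if all(candidate % p for p in primes):
            primes.append(candidate)
        candidate += 1
    return primes

def cortege_prefix(c, length):
    """s_i = 0 if the i-th prime divides c, and s_i = 1 otherwise."""
    return [0 if c % p == 0 else 1 for p in first_primes(length)]
```

For instance, 12 is divisible by 2 and 3 but not by 5, 7, or 11, so its cortège begins 0, 0, 1, 1, 1.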


Let (D, 0, ′) be one model of Peano's postulates, with $0 \in D$ and ′ the
successor operation, and let $(D^{\#}, 0^{\#}, {}^{*})$ be another such model. For
each x in D, by an x-mapping we mean a function f from $S_x = \{u \mid u \in
D \wedge u \le x\}$ into $D^{\#}$ such that $f(0) = 0^{\#}$ and $f(u') = (f(u))^*$ for all $u < x$.
Show by induction that, for every x in D, there is a unique x-mapping
(which will be denoted $f_x$). It is easy to see that, if $x_1 < x_2$, then the
restriction of $f_{x_2}$ to $S_{x_1}$ must be $f_{x_1}$. Define $F(x) = f_x(x)$ for all x in D. Then
F is a function from D into $D^{\#}$ such that $F(0) = 0^{\#}$ and $F(x') = (F(x))^*$
for all x in D. It is easy to prove that F is one-one. (If not, a contradiction
results when we consider the least x in D for which there is
some y in D such that $x \ne y$ and $F(x) = F(y)$.) To see that F is an isomorphism,
it only remains to show that the range of F is $D^{\#}$. If not,
let z be the least element of $D^{\#}$ not in the range of F. Clearly, $z \ne 0^{\#}$.
Hence, $z = w^*$ for some w. Then w is in the range of F, and so $w = F(u)$
for some u in D. Therefore, $F(u') = (F(u))^* = w^* = z$, contradicting the
fact that z is not in the range of F.

The reason why this proof does not work for models of first-order
number theory S is that the proof uses mathematical induction and
the least-number principle several times, and these uses involve properties
that cannot be formulated within the language of S. Since the
validity of mathematical induction and the least-number principle in
models of S is guaranteed to hold, by virtue of axiom (S9), only for wfs
of S, the categoricity proof is not applicable. For example, in a nonstandard
model for S, the property of being the interpretation of one of the
standard integers $\overline{0}, \overline{1}, \overline{2}, \overline{3}, \ldots$ is not expressible by a wf of S. If it were,
then, by axiom (S9), one could prove that $\{\overline{0}, \overline{1}, \overline{2}, \overline{3}, \ldots\}$ constitutes the
whole model.


Use a reduction procedure similar to that given for the theory $K_2$ on
pages 114-115. For any number k, define $k \cdot t$ by induction: $0 \cdot t$ is 0 and
$(k + 1) \cdot t$ is $(k \cdot t) + t$; thus, $k \cdot t$ is the sum of t taken k times. Also, for any
given k, let $t \equiv s \pmod{k}$ stand for $(\exists x)(t = s + k \cdot x \vee s = t + k \cdot x)$. In the
reduction procedure, consider all such wfs $t \equiv s \pmod{k}$, as well as the
wfs $t < s$, as atomic wfs, although they actually are not. Given any wf
of $S_+$, we may assume by Proposition 2.30 that it is in prenex normal
form. Describe a method that, given a wf $(\exists y)\mathscr{C}$, where $\mathscr{C}$ contains no
quantifiers (remembering the convention that $t \equiv s \pmod{k}$ and $t < s$ are
considered atomic), finds an equivalent wf without quantifiers (again
remembering our convention). For help on details, see Hilbert and
Bernays (1934, I, pp. 359-366).
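
The two definitions used in this hint can be tried out on natural numbers. In the sketch below, `times` follows the stated recursion for $k \cdot t$, and `congruent` reads the existential quantifier as a bounded search, which is an assumption that is harmless for natural numbers because any witness x is at most max(t, s) when k is positive. Both function names are made up for the illustration.

```python
def times(k, t):
    """k . t by the recursion 0 . t = 0 and (k + 1) . t = (k . t) + t."""
    return 0 if k == 0 else times(k - 1, t) + t

def congruent(t, s, k):
    """t = s (mod k), read as (Ex)(t = s + k.x or s = t + k.x)."""
    bound = max(t, s)
    return any(t == s + times(k, x) or s == t + times(k, x)
               for x in range(bound + 1))
```

For positive k this agrees with the usual test that k divides the difference of t and s.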


b. Use part (a) and Proposition 3.6(a)(i). 
c. Use part (b) and Lemma 1.12. 


Assume $f(x_1, \ldots, x_n) = x_{n+1}$ is expressible in S by $\mathscr{B}(x_1, \ldots, x_{n+1})$. Let
$\mathscr{C}(x_1, \ldots, x_{n+1})$ be $\mathscr{B}(x_1, \ldots, x_{n+1}) \wedge (\forall z)(z < x_{n+1} \Rightarrow \neg\mathscr{B}(x_1, \ldots, x_n, z))$. Show
that $\mathscr{C}$ represents $f(x_1, \ldots, x_n)$ in S. [Use Proposition 3.8(b).] Assume,
conversely, that $f(x_1, \ldots, x_n)$ is representable in S by $\mathscr{A}(x_1, \ldots, x_{n+1})$. Show
that the same wf expresses $f(x_1, \ldots, x_n) = x_{n+1}$ in S.


a. $(\exists y)_{u<y<v} R(x_1, \ldots, x_n, y)$ is equivalent to $(\exists z)_{z < v \dot{-} (u+1)} R(x_1, \ldots, x_n, z + u + 1)$,
and similarly for the other cases.

If the relation $R(x_1, \ldots, x_n, y)$: $f(x_1, \ldots, x_n) = y$ is recursive, then $C_R$ is recursive
and, therefore, so is $f(x_1, \ldots, x_n) = \mu y(C_R(x_1, \ldots, x_n, y) = 0)$. Conversely,
if $f(x_1, \ldots, x_n)$ is recursive, $C_R(x_1, \ldots, x_n, y) = \mathrm{sg}|f(x_1, \ldots, x_n) - y|$ is recursive.
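
The passage between a function and the characteristic function of its graph can be illustrated as follows. The sketch uses the convention that a characteristic function takes the value 0 where the relation holds; the sample function `f` is hypothetical and stands in for any total function, and `mu` is an unbounded search that would not terminate if no witness existed.

```python
def sg(n):
    """sg(0) = 0 and sg(n) = 1 for n > 0."""
    return 0 if n == 0 else 1

def mu(char_fn, *xs):
    """Least y with char_fn(*xs, y) == 0 (unbounded search)."""
    y = 0
    while char_fn(*xs, y) != 0:
        y += 1
    return y

def f(x1, x2):
    """A hypothetical sample function, used only for illustration."""
    return x1 * x2 + 1

def C_R(x1, x2, y):
    """Characteristic function of the graph of f: 0 exactly when f(x1, x2) = y."""
    return sg(abs(f(x1, x2) - y))
```

Applying `mu` to `C_R` recovers the values of `f`, which mirrors the equation $f(x_1, \ldots, x_n) = \mu y(C_R(x_1, \ldots, x_n, y) = 0)$.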
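$[\sqrt{n}] = \delta(\mu y_{y \le n+1}(y^2 > n))$


$\pi(n) = \sum_{y \le n} \overline{\mathrm{sg}}(C_{\Pr}(y))$


$[ne] = \left[n\left(1 + 1 + \frac{1}{2!} + \frac{1}{3!} + \cdots + \frac{1}{n!}\right)\right]$, since $n\left(\frac{1}{(n+1)!} + \frac{1}{(n+2)!} + \cdots\right) < \frac{1}{n!}$.


Let $1 + 1 + \frac{1}{2!} + \cdots + \frac{1}{n!} = \frac{g(n)}{n!}$. Then $g(0) = 1$ and $g(n + 1) =
(n + 1)g(n) + 1$. Hence, g is primitive recursive. Therefore, so is

$[ne] = \left[\frac{n\,g(n)}{n!}\right] = \mathrm{qt}(n!, n\,g(n))$.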
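
The closed form for $[ne]$ can be compared numerically with a direct computation. In this sketch `g` follows the recursion given above and `floor_n_e` computes $\mathrm{qt}(n!, n\,g(n))$ by integer division. The reference function uses a 40-term rational lower approximation of e, which is an assumption that is accurate enough for the small n tested here.

```python
from fractions import Fraction
from math import factorial

def g(n):
    """g(0) = 1 and g(n + 1) = (n + 1) * g(n) + 1."""
    return 1 if n == 0 else n * g(n - 1) + 1

def floor_n_e(n):
    """[ne] computed as the quotient of n * g(n) by n!."""
    return (n * g(n)) // factorial(n)

def floor_n_e_reference(n, terms=40):
    """[ne] from a rational partial sum of the series for e."""
    e_lower = sum(Fraction(1, factorial(k)) for k in range(terms))
    return (n * e_lower) // 1
```

For example, g(3) = 16 and 3! = 6, so the quotient of 48 by 6 gives 8, which is the integer part of 3e.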


RP(y, z) stands for $(\forall x)_{x \le y+z}(x|y \wedge x|z \Rightarrow x = 1)$.


$\varphi(n) = \sum_{y \le n} \overline{\mathrm{sg}}(C_{RP}(y, n))$


$Z(0) = 0$, $Z(y + 1) = U_2^2(y, Z(y))$.

Let $v = (p_0 p_1 \ldots p_k) + 1$. Some prime q is a divisor of v. Hence, $q \le v$. But
q is different from $p_0, p_1, \ldots, p_k$. If $q = p_j$, then $p_j|v$ and $p_j|p_0 p_1 \ldots p_k$ would
imply that $p_j|1$ and, therefore, $p_j = 1$. Thus, $p_{k+1} \le q \le (p_0 p_1 \ldots p_k) + 1$.
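
The bound $p_{k+1} \le (p_0 p_1 \ldots p_k) + 1$ can be spot-checked for small k. The sketch assumes the primes are indexed from $p_0 = 2$ and uses trial division; the function names are chosen for the illustration only.

```python
def first_primes(count):
    """Return the first `count` primes, starting with 2, by trial division."""
    primes = []
    candidate = 2
    while len(primes) < count:
        if all(candidate % p for p in primes):
            primes.append(candidate)
        candidate += 1
    return primes

def check_bound(k):
    """True when p_{k+1} <= (p_0 * p_1 * ... * p_k) + 1."""
    ps = first_primes(k + 2)          # p_0, ..., p_{k+1}
    product = 1
    for p in ps[:k + 1]:
        product *= p
    return ps[k + 1] <= product + 1
```

For k = 0 the bound is tight: $p_1 = 3 = p_0 + 1$.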


If Goldbach's conjecture is true, h is the constant function 2. If
Goldbach's conjecture is false, h is the constant function 1. In either
case, h is primitive recursive.


List the recursive functions step by step in the following way. In the
first step, start with the finite list consisting of Z(x), N(x), and $U_1^1(x)$. At
the (n + 1)th step, make one application of substitution, recursion and
the μ-operator to all appropriate sequences of functions already in the
list after the nth step, and then add the n + 1 functions $U_j^{n+1}(x_1, \ldots, x_{n+1})$
to the list. Every recursive function eventually appears in the list.


Assume $f_x(y)$ is primitive recursive (or recursive). Then so is $f_x(x) + 1$.
Hence, $f_x(x) + 1$ is equal to $f_k(x)$ for some k. Therefore, $f_k(x) = f_x(x) + 1$ for
all x and, in particular, $f_k(k) = f_k(k) + 1$.


a. Let d be the least positive integer in the set Y of integers of the form
au + bv, where u and v are arbitrary integers; say, $d = au_0 + bv_0$.
Then d|a and d|b. (To see this for a, let a = qd + r, where $0 \le r < d$.
Then $r = a - qd = a - q(au_0 + bv_0) = (1 - qu_0)a + (-qv_0)b \in Y$. Since d
is the least positive integer in Y and r < d, r must be 0. Hence d|a.)
If a and b are relatively prime, then d = 1. Hence, $1 = au_0 + bv_0$.
Therefore, $au_0 \equiv 1 \pmod{b}$.
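
The argument above is an existence proof based on the least element of Y. The coefficients $u_0$ and $v_0$ can also be computed with the extended Euclidean algorithm, which is a different technique from the one in the text and is shown here only as an illustration; the function name `bezout` is an assumption.

```python
def bezout(a, b):
    """Return (d, u0, v0) with d = gcd(a, b) = a*u0 + b*v0, for positive a and b."""
    old_r, r = a, b
    old_u, u = 1, 0
    old_v, v = 0, 1
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_u, u = u, old_u - q * u
        old_v, v = v, old_v - q * v
    return old_r, old_u, old_v
```

When a and b are relatively prime, the returned $u_0$ is an inverse of a modulo b, matching the final congruence in the answer.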


a. $1944 = 2^3 3^5$. Hence, 1944 is the Gödel number of the expression ().
$49 = 1 + 8(2^1 3^1)$. Hence, 49 is the Gödel number of the function
letter $f_1^1$.

a. $g(f_1^1) = 49$ and $g(a_1) = 15$. So, $g(f_1^1(a_1)) = 2^{49} 3^3 5^{15} 7^5$.
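
These computations can be reproduced mechanically. The sketch below covers only the symbol codes that the answers above rely on: 3 and 5 for the two parentheses, $7 + 8k$ for the constant $a_k$, and $1 + 8(2^n 3^k)$ for the function letter $f_k^n$. The tuple encoding of symbols and the function names are assumptions made for the illustration.

```python
def first_primes(count):
    primes = []
    candidate = 2
    while len(primes) < count:
        if all(candidate % p for p in primes):
            primes.append(candidate)
        candidate += 1
    return primes

def g_symbol(sym):
    """Goedel number of a symbol: "(", ")", ("a", k), or ("f", n, k)."""
    if sym == "(":
        return 3
    if sym == ")":
        return 5
    if sym[0] == "a":
        return 7 + 8 * sym[1]
    if sym[0] == "f":
        _, n, k = sym
        return 1 + 8 * (2 ** n * 3 ** k)
    raise ValueError(f"symbol not covered by this sketch: {sym!r}")

def g_expression(symbols):
    """Product of p_j ** g(u_j) over the symbols u_0, u_1, ... of the expression."""
    number = 1
    for p, sym in zip(first_primes(len(symbols)), symbols):
        number *= p ** g_symbol(sym)
    return number
```

With this encoding, the expression $f_1^1(a_1)$ is the four-symbol list consisting of the function letter, a left parenthesis, the constant, and a right parenthesis.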

Take as a normal model for RR, but not for S, the set of polynomials
with integral coefficients such that the leading coefficient is nonnegative.
Note that $(\forall x)(\exists y)(x = y + y \vee x = y + y + 1)$ is false in this model
but is provable in S.


Let $\infty$ be an object that is not a natural number. Let $\infty' = \infty$, $\infty +
x = x + \infty = \infty$ for all natural numbers x, $\infty \cdot 0 = 0 \cdot \infty = 0$, and $\infty \cdot x = x \cdot
\infty = \infty$ for all $x \ne 0$.


Assume S is consistent. By Proposition 3.37(a), “ is not provable in S. 

Hence, by Lemma 2.12, the theory De is consistent. Now, 77% is equiva- 

lent to (Ax,).7(x,, 7). Since there is no proof of ~in S, Pf (k, q) is false for 

all natural numbers k, where g = "7. Hence, +, >.7(k,) for all natural 
numbers k. Therefore, k,, 7.7 (k,q). But, &, (x2). (x2,q). Thus S, is 
w-inconsistent. 

(G. Kreisel, Mathematical Reviews, 1955, Vol. 16, p. 103) Let .7(x,) be a 

wf of S that is the arithmetization of the following: x, is the Gédel 

number of a closed wf .7such that the theory S + {4} is w-inconsistent. 

(The latter says that there is a wf (x) such that, for every n, “(71) is 

provable in S + {.4}, and such that (Ax)>~ (x) is provable in S + {.4}.) By 

the fixed-point theorem, let ~ be a closed wf such that kz 7@ A(T). 

Let K=S + {7}. () wis false in the standard model. (Assume / true. 

Then K is a true theory. But, 7 © .7("7)) is true, since it is provable 

in S. So, .7(A) is true. Hence, K is w-inconsistent and, therefore, K is 

not true, which yields a contradiction.) (2) K is w-consistent. (Assume 

K w-inconsistent. Then .7("7) is true and, therefore, “is true, contra- 

dicting (1).) 

a. Assume the “function” form of Church’s thesis and let A be an 
effectively decidable set of natural numbers. Then the characteris- 
tic function C, is effectively computable and, therefore, recursive. 
Hence, by definition, A is a recursive set. 


b. Assume the “set” form of Church’s thesis and let f (x, ...,x,) be any 
effectively computable function. Then the relation f (x, ..., x,) = y 
is effectively decidable. Using the functions o*, of of pages 184-185 
let A be the set of all z such that f(oi*'(z),...,0n''(z)) = ontt(2). 
Then A is an effectively decidable set and, therefore, recursive. 


n+1 


Hence, f (x1, .5.;%n) = Ore (uz (Ca(z) = 0)) is recursive. 
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Let K be the extension of S that has as proper axioms all wfs that are 
true in the standard model. If Tr were recursive, then, by Proposition 
3.38, K would have an undecidable sentence, which is impossible. 


Use Corollary 3.39. 


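Let $f(x_1, \ldots, x_n)$ be a recursive function. So, $f(x_1, \ldots, x_n) = y$ is a recursive
relation, expressible in K by a wf $\mathscr{A}(x_1, \ldots, x_n, y)$. Then f is representable
by $\mathscr{A}(x_1, \ldots, x_n, y) \wedge (\forall z)(z < y \Rightarrow \neg\mathscr{A}(x_1, \ldots, x_n, z))$, where $z < y$ stands for
$z \le y \wedge z \ne y$.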

a -FO=15 % Hence,+ 4,("0=1)) > 4%,("A) and, therefore, 
bi Z, (A) > -2,,("0 = 17). Thus, F¥> -4,,("0 = 17). 

b. F 42) > Zy(C4&A? 1) 7). Also, k a7 & Z, ("7 7), and so, 
F Bail?) & A(A(? 1) 1). Hence t A (°9 7) > .4,(7 7). By 
a tautology, F => («7 => (7A 4%); hence, F .A,(°9 1) > Ag > 
(7A 72)'). Therefore, + .%,("F7) > (4,(97)) > AAA 2Z))). It 
follows that k .4,,(°77) > 4,((¢A 37). But, k 7 Ang > 0=1;s0, 
+ AA(GA AD) > &,,("0= 1. Thus, F Z,("77) > &,("0 =1, 
andk -%,,(0 = 17) > 34%,,("77). Hence, Fk °4,,("0 = 17) > «. 

If a theory K is recursively decidable, the set of Gödel numbers of theorems
of K is recursive. Taking the theorems of K as axioms, we obtain
a recursive axiomatization.


Assume there is a recursive set C such that T, € C and Refx < C;. Let 

C be expressible in K by ./(x). Let .4 with Gédel number k, be a fixed 

point for 7../(x). Then, ky 4 @ 7.7(k). Since ./(x) expresses C in K, 

Fy o7(k) or Fe av(k). 

a. If ty (k), then Fy =.% Therefore, k €¢ Refk ¢ C. Hence, Fy i(k ) 
contradicting the consistency of K. 

b. IfKK 7/(k), then, .% So, k € Ty € C and therefore, ky ./(k), con- 
tradicting the consistency of K. 

Let K, be the theory whose axioms are those wfs of K, that are prov- 

able in K*. The theorems of K, are the axioms of K,. Hence, x € Tx, if 

and only if Fmlx,(x) Ax €T,.. So, if K* were recursively decidable— 

that is, if T,. were recursive—T,, would be recursive. Since K, is a con- 

sistent extension of K,, this would contradict the essential recursive 

undecidability of K,. 

a. Compare the proof of Proposition 2.28. 


b. By part (a), K* is consistent. Hence, by Exercise 3.60, K* is essentially 
recursively undecidable. So, by (a), K is recursively undecidable. 


b. Take (Wx)(Aj(x) ee x) as a possible definition of Aj. 
Use Exercises 3.61(b) and 3.62. 
Use Corollary 3.46, Exercise 3.63, and Proposition 3.47. 
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Chapter 4 


4.12 


4.15 


4.18 


4.19 


4.22 


4.23 


4.27 
4.30 
4.33 


4.39 
4.40 


4.41 


4.42 


4.43 


4.44 


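(s) Assume $u \in x \times y$. Then $u = \langle v, w\rangle = \{\{v\}, \{v, w\}\}$ for some v in x and w
in y. Then $v \in x \cup y$ and $w \in x \cup y$. So, $\{v\} \in \mathscr{P}(x \cup y)$ and $\{v, w\} \in \mathscr{P}(x \cup y)$.
Hence, $\{\{v\}, \{v, w\}\} \in \mathscr{P}(\mathscr{P}(x \cup y))$.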
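
For small finite sets the inclusion $x \times y \subseteq \mathscr{P}(\mathscr{P}(x \cup y))$ can be verified by computing both sides. The sketch models sets as Python frozensets and ordered pairs as $\{\{v\}, \{v, w\}\}$; the sample sets and function names are assumptions chosen for the illustration.

```python
from itertools import combinations

def powerset(s):
    """The set of all subsets of s, as a frozenset of frozensets."""
    items = list(s)
    return frozenset(frozenset(c) for r in range(len(items) + 1)
                     for c in combinations(items, r))

def pair(v, w):
    """The ordered pair <v, w> coded as {{v}, {v, w}}."""
    return frozenset({frozenset({v}), frozenset({v, w})})

x = frozenset({1, 2})
y = frozenset({2, 3})
cartesian = {pair(v, w) for v in x for w in y}
double_power = powerset(powerset(x | y))
```

Note that the pair with v = w = 2 collapses to {{2}}, which is still a member of the double power set.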

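(t) $X \subseteq Y \vee Y \subseteq X$

a. $\mathscr{D}(x) \subseteq \bigcup(\bigcup x)$ and $\mathscr{R}(x) \subseteq \bigcup(\bigcup x)$. Apply Corollary 4.6(b).

b. Use Exercises 4.12(s), 4.13(b), axiom W, and Corollary 4.6(b).

c. If Rel(Y), then $Y \subseteq \mathscr{D}(Y) \times \mathscr{R}(Y)$. Use part (b) and Corollary 4.6(b).

Let $X = \{\langle y_1, y_2\rangle \mid y_1 = y_2 \wedge y_1 \in Y\}$; that is, X is the class of all ordered
pairs $\langle u, u\rangle$ with $u \in Y$. Clearly, Fnc(X) and, for any set x, $(\exists v)(\langle v, u\rangle \in X
\wedge v \in x) \Leftrightarrow u \in Y \cap x$. So, by axiom R, $M(Y \cap x)$.

Assume Fnc(Y). Then $\mathrm{Fnc}(x \upharpoonright Y)$ and $\mathscr{D}(x \upharpoonright Y) \subseteq x$. By axiom R, $M(Y''x)$.

a. Let $\emptyset$ be the class $\{u \mid u \ne u\}$. Assume M(X). Then $\emptyset \subseteq X$. So, $\emptyset = \emptyset \cap X$.
By axiom S, $M(\emptyset)$.

Assume M(V). Let $Y = \{x \mid x \notin x\}$. It was proved above that $\neg M(Y)$. But
$Y \subseteq V$. Hence, by Corollary 4.6(b), $\neg M(V)$.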

b. grandparent and uncle 

c. Let ube the least €-element of X — Z. 

a. By Proposition 4.11(a), Trans(@). By Proposition 4.11(b) and 
Proposition 4.8(j), @ € On. If w € K, then @ € , contradicting 
Proposition 4.8(a). Hence, w ¢ Kj. 

Let X, = X x {@} and Y, = Y x {I}. 

For any u C y, let the characteristic function C, be the function with 

domain y such that C,w =@ if w €u and C,w =1if w € y — u. Let Fbe 

the function with domain .7(y) such that F’u = C, for u € .7(y). Then 

P(x)z2% = 

F 

a. For any set u, 7 (u) is a set by Exercise 4.15(a). 

b. Ifuexy,thenu Cy x x.S0, x9 ly x x). 

a. @ is the only function with domain @. 

c. If J(u) #@, then vu) 4 @. 

Define a function F with domain X such that, for any xy in X, F(x9) is 

the function g in X'"! such that g'u = x). Then X=X ame 

Assume X2Y and ZeW. If -M(W), then =-M(Z) and X2 = YY = @ 

by Exercise 4.41(a). Hence, we may assume M(W) and M(Z). Define 


a function ® on X% as follows: if f € X%, let ® 'f = F of o G". Then 
X7=y™. 
o) 
4.45


4.46 


4.59 


4.62 


4.63 
If $X$ or $Y$ is not a set, then $Z^{X \cup Y}$ and $Z^X \times Z^Y$ are both $\emptyset$. We may assume then that $X$ and $Y$ are sets. Define a function $\Phi$ with domain $Z^{X \cup Y}$ as follows: if $f \in Z^{X \cup Y}$, let $\Phi'f = \langle X \upharpoonleft f, Y \upharpoonleft f\rangle$. Then $Z^{X \cup Y} \cong_\Phi Z^X \times Z^Y$.

Define a function $F$ with domain $(x^y)^z$ as follows: for any $f$ in $(x^y)^z$, let $F'f$ be the function in $x^{y \times z}$ such that $(F'f)'\langle u, v\rangle = (f'v)'u$ for all $\langle u, v\rangle \in y \times z$. Then $(x^y)^z \cong_F x^{y \times z}$.
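The bijection $F$ just defined is the familiar "uncurrying" map, and it can be checked by brute force on small finite sets. The following is only a sketch: functions are modelled as Python dictionaries, and the helper names `all_functions` and `uncurry` are illustrative assumptions, not notation from the text.

```python
from itertools import product

def all_functions(domain, codomain):
    """Yield every function from domain to codomain, represented as a dict."""
    domain = list(domain)
    for values in product(list(codomain), repeat=len(domain)):
        yield dict(zip(domain, values))

def uncurry(f, y, z):
    """Send f in (x^y)^z to the function on y x z with (u, v) -> (f'v)'u."""
    return {(u, v): f[v][u] for u in y for v in z}

# Small illustrative sets (hypothetical choices).
x, y, z = [0, 1], ["a", "b"], ["p", "q", "r"]

inner = list(all_functions(y, x))          # x^y
outer = list(all_functions(z, inner))      # (x^y)^z
images = {frozenset(uncurry(f, y, z).items()) for f in outer}
target = {frozenset(g.items()) for g in all_functions(product(y, z), x)}

# F is one-one (no two f share an image) and onto x^(y x z).
print(len(images) == len(outer) and images == target)
```

The check compares the set of images with the set of all functions on $y \times z$; equality of the two sets together with equality of the counts is what "one-one and onto" amounts to in the finite case.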

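If $\neg M(Z)$, $(X \times Y)^Z = \emptyset = \emptyset \times \emptyset = X^Z \times Y^Z$. Assume then that $M(Z)$. Define a function $F: X^Z \times Y^Z \to (X \times Y)^Z$ as follows: for any $f \in X^Z$, $g \in Y^Z$, $(F'\langle f, g\rangle)'z = \langle f'z, g'z\rangle$ for all $z$ in $Z$. Then $X^Z \times Y^Z \cong_F (X \times Y)^Z$.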

This is a direct consequence of Proposition 4.19. 

b. Use Bernstein’s theorem (Proposition 4.23(d)). 

c. Use Proposition 4.23(c, d).

Define a function $F$ from $V$ into $2_c$ as follows: $F'u = \{u, \emptyset\}$ if $u \neq \emptyset$; $F'\emptyset = \{1, 2\}$. Since $F$ is one-one, $V \preccurlyeq 2_c$. Hence, by Exercises 4.23 and 4.50, $\neg M(2_c)$.
(h) Use Exercise 4.45. 

i. $2^x \preccurlyeq 2^x +_c x \preccurlyeq 2^x +_c 2^x = 2^x \times 2 \cong 2^x \times 2^1 \cong 2^{x +_c 1} \cong 2^x$.

Hence, by Bernstein's Theorem, $2^x +_c x \cong 2^x$.

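Under the assumption of the axiom of infinity, $\omega$ is a set such that $(\exists u)(u \in \omega) \wedge (\forall y)(y \in \omega \Rightarrow (\exists z)(z \in \omega \wedge y \subset z))$. Conversely, assume (*) and let $b$ be a set such that (i) $(\exists u)(u \in b)$ and (ii) $(\forall y)(y \in b \Rightarrow (\exists z)(z \in b \wedge y \subset z))$. Let $d = \{u \mid (\exists z)(z \in b \wedge u \subseteq z)\}$. Since $d \subseteq \mathscr{P}(\bigcup(b))$, $d$ is a set. Define a relation $R = \{\langle n, v\rangle \mid n \in \omega \wedge v = \{u \mid u \in d \wedge u \cong n\}\}$. Thus, $\langle n, v\rangle \in R$ if and only if $n \in \omega$ and $v$ consists of all elements of $d$ that are equinumerous with $n$. $R$ is a one-one function with domain $\omega$ and range a subset of $\mathscr{P}(d)$. Hence, by the replacement axiom applied to $R^{-1}$, $\omega$ is a set and, therefore, axiom I holds.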

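a. Induction on $\alpha$ in $(\forall x)(x \cong \alpha \wedge \alpha \in \omega \Rightarrow \mathrm{Fin}(\mathscr{P}(x)))$.

b. Induction on $\alpha$ in $(\forall x)(x \cong \alpha \wedge \alpha \in \omega \wedge (\forall y)(y \in x \Rightarrow \mathrm{Fin}(y)) \Rightarrow \mathrm{Fin}(\bigcup x))$.

c. Use Proposition 4.27(a).

d. $x \subseteq \mathscr{P}(\bigcup x)$ and $y \in x \Rightarrow y \subseteq \bigcup x$.

e. Induction on $\alpha$ in $(\forall x)(x \cong \alpha \wedge \alpha \in \omega \Rightarrow (x \preccurlyeq y \vee y \preccurlyeq x))$.

g. Induction on $\alpha$ in $(\forall x)(x \cong \alpha \wedge \alpha \in \omega \wedge \mathrm{Inf}(y) \Rightarrow x \preccurlyeq y)$.

h. Use Proposition 4.26(c).

j. $x^y \subseteq \mathscr{P}(y \times x)$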

Let $Z$ be a set such that every non-empty set of subsets of $Z$ has a minimal element. Assume Inf(Z). Let $Y$ be the set of all infinite subsets of $Z$. Then $Y$ is a non-empty set of subsets of $Z$ without a minimal element. Conversely, prove by induction that, for all $\alpha$ in $\omega$, any non-empty subset of $\mathscr{P}(\alpha)$ has a minimal element. The result then carries over to non-empty subsets of $\mathscr{P}(z)$, where $z$ is any finite set.
4.64


4.65 


4.68 


4.69 


4.70 


4.71 


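a. Induction on $\alpha$ in $(\forall x)(x \cong \alpha \wedge \alpha \in \omega \wedge \mathrm{Den}(y) \Rightarrow \mathrm{Den}(x \cup y))$.

b. Induction on $\alpha$ in $(\forall x)(x \cong \alpha \wedge x \neq \emptyset \wedge \mathrm{Den}(y) \Rightarrow \mathrm{Den}(x \times y))$.

c. Assume $z \subseteq x$ and Den(z). Let $z \cong_f \omega$. Define a function $g$ on $x$ as follows: $g'u = u$ if $u \in x - z$; $g'u = (f^{-1})'((f'u)')$ if $u \in z$. Assume $x$ is Dedekind-infinite. Assume $z \subset x$ and $x \cong_f z$. Let $v \in x - z$. Define a function $h$ on $\omega$ such that $h'\emptyset = v$ and $h'(\alpha') = f'(h'\alpha)$ if $\alpha \in \omega$. Then $h$ is one-one. So, $\mathrm{Den}(h''\omega)$ and $h''\omega \subseteq x$.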


f. Assume $y \notin x$. (i) Assume $x \cup \{y\} \cong_f x$. Define by induction a function $g$ on $\omega$ such that $g'\emptyset = y$ and $g'(n + 1) = f'(g'n)$. $g$ is a one-one function from $\omega$ into $x$. Hence, $x$ contains a denumerable subset and, by part (c), $x$ is Dedekind-infinite. (ii) Assume $x$ is Dedekind-infinite. Then, by part (c), there is a denumerable subset $z$ of $x$. Assume $z \cong_f \omega$. Let $c_0 = (f^{-1})'\emptyset$. Define a function $F$ as follows: $F'u = u$ for $u \in x - z$; $F'c_0 = y$; $F'u = (f^{-1})'(f'u - 1)$ for $u \in z - \{c_0\}$. Then $x \cong_F x \cup \{y\}$. If $z$ is $\{c_0, c_1, c_2, \ldots\}$, $F$ takes $c_{i+1}$ into $c_i$ and moves $c_0$ into $y$.

g. Assume $\omega \preccurlyeq x$. By part (c), $x$ is Dedekind-infinite. Choose $y \notin x$. By part (f), $x \cong x \cup \{y\}$. Hence, $x +_c 1 = (x \times \{\emptyset\}) \cup \{\langle\emptyset, 1\rangle\} \cong x \cup \{y\} \cong x$.

Assume M is a model of NBG with denumerable domain $D$. Let $d$ be the element of $D$ satisfying the wf $x = 2^\omega$. Hence, $d$ satisfies the wf $\neg(x \cong \omega)$. This means that there is no object in $D$ that satisfies the condition of being a one-one correspondence between $d$ and $\omega$. Since $D$ is denumerable, there is a one-one correspondence between the set of "elements" of $d$ (that is, the set of objects $c$ in $D$ such that $\models_M c \in d$) and the set of natural numbers. However, no such one-one correspondence exists within M.


NBG is finitely axiomatizable and has only the binary predicate letter $A_2^2$. The argument on pages 273-274 shows that NBG is recursively undecidable. Hence, by Proposition 3.49, the predicate calculus with $A_2^2$ as its only non-logical constant is recursively undecidable.


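a. Assume $x \preccurlyeq \omega_\alpha$. If $2 \preccurlyeq x$, then, by Propositions 4.37(b) and 4.40, $\omega_\alpha \preccurlyeq x \cup \omega_\alpha \preccurlyeq x \times \omega_\alpha \preccurlyeq \omega_\alpha \times \omega_\alpha \cong \omega_\alpha$. If $x$ contains one element, use Exercise 4.64(c, f).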


b. Use Corollary 4.41. 

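a. $\mathscr{P}(\omega_\alpha) \times \mathscr{P}(\omega_\alpha) \cong 2^{\omega_\alpha} \times 2^{\omega_\alpha} \cong 2^{\omega_\alpha +_c \omega_\alpha} \cong 2^{\omega_\alpha} \cong \mathscr{P}(\omega_\alpha)$

b. $(\mathscr{P}(\omega_\alpha))^x \cong (2^{\omega_\alpha})^x \cong 2^{\omega_\alpha \times x} \cong 2^{\omega_\alpha} \cong \mathscr{P}(\omega_\alpha)$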

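a. If $y$ were non-empty and finite, $y \cong y +_c y$ would contradict Exercise 4.62(b).

b. By part (c), let $y = u \cup v$, $u \cap v = \emptyset$, $u \cong y$, $v \cong y$. Let $y \cong_f v$. Define a function $g$ on $\mathscr{P}(y)$ as follows: for $x \subseteq y$, let $g'x = u \cup (f''x)$. Then $g'x \subseteq y$ and $y \cong u \preccurlyeq g'x \preccurlyeq y$. Hence, $g'x \cong y$. So, $g$ is a one-one function from $\mathscr{P}(y)$ into $A = \{z \mid z \subseteq y \wedge z \cong y\}$. Thus, $\mathscr{P}(y) \preccurlyeq A$. Since $A \subseteq \mathscr{P}(y)$, $A \preccurlyeq \mathscr{P}(y)$.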
4.73
Use part (d): $\{z \mid z \subseteq y \wedge z \cong y\} \subseteq \{z \mid z \subseteq y \wedge \mathrm{Inf}(z)\}$.

Bee ey Jae anc pa eee epee: 

on $y$ as follows: $f'x = h'x$ if $x \in u$ and $f'x = (h^{-1})'x$ if $x \in v$.

Use Proposition 4.37(b). 

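i. $\mathrm{Perm}(y) \subseteq y^y \preccurlyeq (2^y)^y \cong 2^{y \times y} \cong 2^y \cong \mathscr{P}(y)$.

ii. By part (a), we may use Exercise 4.71(c). Let $y = u \cup v$, $u \cap v = \emptyset$, $u \cong y$, $v \cong y$. Let $u \cong_H v$ and $y \cong_G u$. Define a function $F: \mathscr{P}(y) \to \mathrm{Perm}(y)$ in the following way: assume $z \in \mathscr{P}(y)$. Let $\psi_z: y \to y$ be defined as follows: $\psi_z{}'x = H'x$ if $x \in G''z$; $\psi_z{}'x = (H^{-1})'x$ if $(H^{-1})'x \in G''z$; $\psi_z{}'x = x$ otherwise. Then $\psi_z \in \mathrm{Perm}(y)$. Let $F'z = \psi_z$. $F$ is one-one. Hence, $\mathscr{P}(y) \preccurlyeq \mathrm{Perm}(y)$.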


Use WO and Proposition 4.19. 


The proof of Zorn > WO in Proposition 4.42 uses only this special 
case of Zorn’s Lemma. 


To prove the Hausdorff maximal principle (HMP) from Zorn, consider some $\subset$-chain $C_0$ in $x$. Let $y$ be the set of all $\subset$-chains $C$ in $x$ such that $C_0 \subseteq C$ and apply part (b) to $y$. Conversely, assume HMP. To prove part (b), assume that the union of each non-empty $\subset$-chain in a given non-empty set $x$ is also in $x$. By HMP applied to the $\subset$-chain $\emptyset$, there is some maximal $\subset$-chain $C$ in $x$. Then $\bigcup(C)$ is a $\subset$-maximal element of $x$.


Assume the Teichmüller-Tukey lemma (TT). To prove part (b), assume that the union of each non-empty $\subset$-chain in a given non-empty set $x$ is also in $x$. Let $y$ be the set of all $\subset$-chains in $x$. $y$ is easily seen to be a set of finite character. Therefore, $y$ contains a $\subset$-maximal element $C$. Then $\bigcup(C)$ is a $\subset$-maximal element of $x$. Conversely, let $x$ be any set of finite character. In order to prove TT by means of part (b), we must show that, if $C$ is a $\subset$-chain in $x$, then $\bigcup(C) \in x$. By the finite character of $x$, it suffices to show that every finite subset $z$ of $\bigcup(C)$ is in $x$. Now, since $z$ is finite, $z$ is a subset of the union of a finite subset $W$ of $C$. Since $C$ is a $\subset$-chain, $W$ has a $\subset$-greatest element $w \in x$, and $z$ is a subset of $w$. Since $x$ is of finite character, $z \in x$.


Assume Rel(x). Let $u = \{z \mid (\exists v)(v \in \mathscr{D}(x) \wedge z = \{v\} \upharpoonleft x)\}$; that is, $z \in u$ if $z$ is the set of all ordered pairs $\langle v, w\rangle$ in $x$, for some fixed $v$. Apply the multiplicative axiom to $u$. The resulting choice set $y \subseteq x$ is a function with domain $\mathscr{D}(x)$. Conversely, the given property easily yields the multiplicative axiom. If $x$ is a set of disjoint non-empty sets, let $r$ be the set of all ordered pairs $\langle u, v\rangle$ such that $u \in x$ and $v \in u$. Hence, there is a function $f \subseteq r$ such that $\mathscr{D}(f) = \mathscr{D}(r) = x$. The range $\mathscr{R}(f)$ is the required choice set for $x$.
4.76


4.79 


4.81 


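By trichotomy, either $x \preccurlyeq y$ or $y \preccurlyeq x$. If $x \preccurlyeq y$, there is a function with domain $y$ and range $x$. (Assume $x \cong_f y_1 \subseteq y$. Take $c \in x$. Define $g'u = c$ if $u \in y - y_1$, and $g'u = (f^{-1})'u$ if $u \in y_1$.) Similarly, if $y \preccurlyeq x$, there is a function with domain $x$ and range $y$. Conversely, to prove WO, apply the assumption (f) to $x$ and $\mathscr{H}'(\mathscr{P}(x))$. Note that, if $(\exists f)(f: u \to v \wedge \mathscr{R}(f) = v)$, then $\mathscr{P}(v) \preccurlyeq \mathscr{P}(u)$. Therefore, if there were a function $f$ from $x$ onto $\mathscr{H}'(\mathscr{P}(x))$, we would have $\mathscr{H}'(\mathscr{P}(x)) \prec \mathscr{P}(\mathscr{H}'(\mathscr{P}(x))) \preccurlyeq \mathscr{P}(x)$, contradicting the definition of $\mathscr{H}'(\mathscr{P}(x))$. Hence, there is a function from $\mathscr{H}'(\mathscr{P}(x))$ onto $x$. Since $\mathscr{H}'(\mathscr{P}(x))$ is an ordinal, one can define a one-one function from $x$ into $\mathscr{H}'(\mathscr{P}(x))$. Thus $x \preccurlyeq \mathscr{H}'(\mathscr{P}(x))$ and, therefore, $x$ can be well-ordered.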


If $<$ is a partial ordering of $x$, use Zorn's lemma to obtain a maximal partial ordering $<^*$ of $x$ with $< \;\subseteq\; <^*$. But a maximal partial ordering must be a total ordering. (If $u$, $v$ were distinct elements of $x$ unrelated by $<^*$, we could add to $<^*$ all pairs $\langle u_1, v_1\rangle$ such that $u_1 \leq^* u$ and $v \leq^* v_1$. The new relation would be a partial ordering properly containing $<^*$.)


b. 


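Since $x \times y \cong x +_c y$, $x \times y = a \cup b$ with $a \cap b = \emptyset$, $a \cong x$, $b \cong y$. Let $r$ be a well-ordering of $y$. (i) Assume there exists $u$ in $x$ such that $\langle u, v\rangle \in a$ for all $v$ in $y$. Then $y \preccurlyeq a$. Since $a \cong x$, $y \preccurlyeq x$, contradicting $\neg(y \preccurlyeq x)$. Hence, (ii) for any $u$ in $x$, there exists $v$ in $y$ such that $\langle u, v\rangle \in b$. Define $f: x \to b$ such that $f'u = \langle u, v\rangle$, where $v$ is the $r$-least element of $y$ such that $\langle u, v\rangle \in b$. Since $f$ is one-one, $x \preccurlyeq b \cong y$.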

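Clearly Inf(z) and $\mathrm{Inf}(x +_c z)$. Then $x +_c z \cong (x +_c z)^2 \cong x^2 +_c 2 \times (x \times z) +_c z^2 \cong x +_c 2 \times (x \times z) +_c z$.

Therefore, $x \times z \preccurlyeq 2 \times (x \times z) \preccurlyeq x +_c 2 \times (x \times z) +_c z \cong x +_c z$. Conversely, $x +_c z \preccurlyeq x \times z$ by Proposition 4.37(b).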


If AC holds, $(\forall y)(\mathrm{Inf}(y) \Rightarrow y \cong y \times y)$ follows from Proposition 4.40 and Exercise 4.73(a). Conversely, if we assume $y \cong y \times y$ for all infinite $y$, then, by parts (c) and (b), it follows that $x \preccurlyeq \mathscr{H}'x$ for any infinite set $x$. Since $\mathscr{H}'x$ is an ordinal, $x$ can be well-ordered. Thus, WO holds.


Let $<$ be a well-ordering of the range of $r$. Let $f'\emptyset$ be the $<$-least element of $\mathscr{R}(r)$, and let $f'(n')$ be the $<$-least element of those $v$ in $\mathscr{R}(r)$ such that $\langle f'n, v\rangle \in r$.

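Assume $\mathrm{Den}(x) \wedge (\forall u)(u \in x \Rightarrow u \neq \emptyset)$. Let $\omega \cong_g x$. Let $r$ be the set of all pairs $\langle a, b\rangle$ such that $a$ and $b$ are finite sequences $\langle v_0, v_1, \ldots, v_n\rangle$ and $\langle v_0, v_1, \ldots, v_n, v_{n+1}\rangle$ such that, for $0 \leq i \leq n + 1$, $v_i \in g'i$. Since $\mathscr{R}(r) \subseteq \mathscr{D}(r)$, PDC produces a function $h: \omega \to \mathscr{D}(r)$ such that $\langle h'n, h'(n')\rangle \in r$ for all $n$ in $\omega$. Define the choice function $f$ by taking, for each $u$ in $x$, $f'u$ to be the $((g^{-1})'u)$th component of the sequence $h'((g^{-1})'u)$.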


Assume PDC and Inf(x). Let $r$ consist of all ordered pairs $\langle u, u \cup \{a\}\rangle$, where $u \cup \{a\} \subseteq x$, $\mathrm{Fin}(u \cup \{a\})$, and $a \notin u$. By PDC, there is a function $f: \omega \to \mathscr{D}(r)$ such that $\langle f'n, f'(n')\rangle \in r$ for all $n$ in $\omega$. Define $g: \omega \to x$ by setting $g'n$ equal to the unique element of $f'(n') - f'n$. Then $g$ is one-one, and so, $\omega \preccurlyeq x$.


d. In the proof of Proposition 4.44(b), instead of using the choice function $h$, apply PDC to obtain the function $f$. As the relation $r$, use the set of all pairs $\langle u, v\rangle$ such that $u \in c$, $v \in c$, $v \in u \cap X$.


a. Use transfinite induction. 

d. Use induction on $\beta$.

(e)-(f) Use transfinite induction and part (a). 

h. Assume $u \subseteq H$. Let $v$ be the set of ranks $\rho'x$ of elements $x$ in $u$. Let $\beta = \bigcup v$. Then $u \subseteq \Psi'\beta$. Hence $u \in \mathscr{P}(\Psi'\beta) = \Psi'(\beta') \subseteq H$.

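Assume $X \neq \emptyset \wedge \neg(\exists y)(y \in X \wedge y \cap X = \emptyset)$. Choose $u \in X$. Define a function $g$ such that $g'\emptyset = u \cap X$, $g'(n') = \bigcup(g'n) \cap X$. Let $x = \bigcup(\mathscr{R}(g))$. Then $x \neq \emptyset$ and $(\forall y)(y \in x \Rightarrow y \cap x \neq \emptyset)$.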

Hint: Assume that the other axioms of NBG are consistent and that the Axiom of Infinity is provable from them. Show that $H_\omega$ is a model for the other axioms but not for the Axiom of Infinity.

Use Hos. 


a. Let $C = \{x \mid \neg(\exists y)(x \in y \wedge y \in x)\}$.


Chapter 5 


5.1 


5.2 
5.7 


5.8 


qo|Bqo 

qoBRq; 

11140 

qiBRq, 

a. U3 b. &(x) 

Let a Turing machine $\mathscr{T}$ compute the function $f$. Replace all occurrences of $q_0$ in the quadruples of $\mathscr{T}$ by a new internal state $q_r$. Then add the quadruples $q_0\, a_i\, a_i\, q_r$ for all symbols $a_i$ of the alphabet of $\mathscr{T}$. The Turing machine defined by the enlarged set of quadruples also computes the function $f$.
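The renaming construction above can be tried out on a toy machine. The sketch below assumes a quadruple format `(state, scanned symbol, action, next state)` in which the action is either a symbol to print or one of the moves `"L"` and `"R"`; the simulator `run`, the helper `avoid_reentry`, and the sample successor machine are all hypothetical illustrations rather than constructions taken from the text.

```python
def run(quads, tape, state="q0", pos=0, max_steps=10_000):
    """Run a quadruple machine until no quadruple applies; return the tape contents."""
    table = {(q, a): (act, q2) for q, a, act, q2 in quads}
    cells = dict(enumerate(tape)) or {0: "B"}     # "B" is the blank symbol
    for _ in range(max_steps):
        key = (state, cells.get(pos, "B"))
        if key not in table:                      # machine halts
            lo, hi = min(cells), max(cells)
            return "".join(cells.get(i, "B") for i in range(lo, hi + 1)).strip("B")
        act, state = table[key]
        if act == "R":
            pos += 1
        elif act == "L":
            pos -= 1
        else:
            cells[pos] = act                      # print a symbol
    raise RuntimeError("step limit reached")

def avoid_reentry(quads, alphabet, new_state="qr"):
    """Rename q0 to new_state everywhere, then add q0 a a new_state for every symbol a."""
    renamed = [(new_state if q == "q0" else q, a, act,
                new_state if q2 == "q0" else q2) for q, a, act, q2 in quads]
    return renamed + [("q0", a, a, new_state) for a in alphabet]

# Toy machine: move right over strokes, append one stroke, then halt.
succ = [("q0", "|", "R", "q0"), ("q0", "B", "|", "q1")]
print(run(succ, "|||"), run(avoid_reentry(succ, ["|", "B"]), "|||"))
```

In the transformed machine the state `q0` is used only for the first step, which rewrites the scanned symbol with itself and hands control to `qr`; from then on the computation mirrors that of the original machine.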

$\rho$ finds the first non-blank square to the right of the initially scanned square and then stops; if there is no such square, it keeps moving to the right forever. $\lambda$'s behavior is similar to that of $\rho$, except that it moves to the left.


a. $N(x) = x + 1$   b. $f(x) = 1$ for all $x$   c. $2x$
5.12


5.14 
5.16 


5.20 


a. 


1 1 
P(Ky)? lagh ~—> Zrar —~ BK 
re 
Cc PKL 


a. The empty function   b. $N(x) = x + 1$   c. $Z(x)$

If $f(a_1) = b_1, \ldots, f(a_n) = b_n$, then

$f(x) = \mu y[(x = a_1 \wedge y = b_1) \vee \cdots \vee (x = a_n \wedge y = b_n)]$
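The $\mu$-operator definition of a function with finite domain can be mirrored directly as an unbounded search. This is a sketch only: `mu` and `finite_function` are hypothetical helper names, and the explicit domain check is an added convenience, since a genuine $\mu$-search simply fails to terminate outside the domain.

```python
from itertools import count

def mu(pred):
    """Unbounded minimization: the least y with pred(y); loops forever if none exists."""
    return next(y for y in count() if pred(y))

def finite_function(pairs):
    """Build f with f(a_i) = b_i from a list of pairs (a_i, b_i), via a mu-search."""
    def f(x):
        # Assumption for the demo: raise instead of diverging when x is outside the domain.
        if all(x != a for a, _ in pairs):
            raise ValueError("f is undefined at this argument")
        return mu(lambda y: any(x == a and y == b for a, b in pairs))
    return f

f = finite_function([(0, 5), (3, 1)])
print(f(0), f(3))
```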


Let $g(z, x) = U(\mu y T_1(z, x, y))$ and use Corollary 5.11. Let $v_0$ be a number such that $g(x, x) + 1 = g(v_0, x)$. Then, if $g(v_0, v_0)$ is defined, $g(v_0, v_0) + 1 = g(v_0, v_0)$, which is impossible.


Risa Nig SIG aie Xn)88(Cr (ics Xp) )+ ++ 


Ay (x1, Weag Xn)°88 (Cr (1, es ef Xn)) 


a. Assume that $h(x)$ is a recursive function such that $h(x) = \mu y T_1(x, x, y)$ for every $x$ in the domain of $\mu y T_1(x, x, y)$. Then $(\exists y)T_1(x, x, y)$ if and only if $T_1(x, x, h(x))$. Since $T_1(x, x, h(x))$ is a recursive relation, this contradicts Corollary 5.13(a).

b. Use Exercise 5.21. 


$Z(\mu y T_1(x, x, y))$ is recursively completable, but its domain is $\{x \mid (\exists y)T_1(x, x, y)\}$, which, by Corollary 5.13(a), is not recursive.


Let $\mathscr{T}$ be a Turing machine with a recursively unsolvable halting problem. Let $a_k$ be a symbol not in the alphabet of $\mathscr{T}$. Let $q_r$ be an internal state symbol that does not occur in the quadruples of $\mathscr{T}$. For each $q_i$ of $\mathscr{T}$ and $a_j$ of $\mathscr{T}$, if no quadruple of $\mathscr{T}$ begins with $q_i\, a_j$, then add the quadruple $q_i\, a_j\, a_k\, q_r$. Call the new Turing machine $\mathscr{T}^*$. Then, for any initial tape description $\alpha$ of $\mathscr{T}$, $\mathscr{T}^*$, begun on $\alpha$, prints $a_k$ if and only if $\mathscr{T}$ is applicable to $\alpha$. Hence, if the printing problem for $\mathscr{T}^*$ and $a_k$ were recursively solvable, then the halting problem for $\mathscr{T}$ would be recursively solvable.


Let $\mathscr{T}$ be a Turing machine with a recursively unsolvable halting problem. For any initial tape description $\alpha$ for $\mathscr{T}$, construct a Turing machine $\mathscr{T}_\alpha$ that does the following: for any initial tape description $\beta$, start $\mathscr{T}$ on $\alpha$; if $\mathscr{T}$ stops, erase the result and then start $\mathscr{T}$ on $\beta$. It is easy to check that $\mathscr{T}$ is applicable to $\alpha$ if and only if $\mathscr{T}_\alpha$ has a recursively unsolvable halting problem. It is very tedious to show how to construct $\mathscr{T}_\alpha$ and to prove that the Gödel number of $\mathscr{T}_\alpha$ is a recursive function of the Gödel number of $\alpha$.
5.33 Let $v_0$ be the index of a partial recursive function $G(x)$ with non-empty domain. If the given decision problem were recursively solvable, so would be the decision problem of Example 1 on page 340.


5.34 By Corollary 5.16, there is a recursive function $g(u)$ such that $\varphi^1_{g(u)}(x) = x \cdot \mu y T_1(u, u, y)$. Then $\varphi^1_{g(u)}$ has an empty domain if and only if $\neg(\exists y)T_1(u, u, y)$. But, $\neg(\exists y)T_1(u, u, y)$ is not recursive by Corollary 5.13(a).


5.39 a. 


5.42 a. 


By Corollary 5.16, there is a recursive function $g(u)$ such that $\varphi^1_{g(u)}(x) = \mu y(x = u \wedge y = x)$. The domain of $\varphi^1_{g(u)}$ is $\{u\}$. Apply the fixed-point theorem to $g$.

There is a recursive function $g(u)$ such that $\varphi^1_{g(u)}(x) = \mu y(x \neq u \wedge y = 0)$. Apply the fixed-point theorem to $g$.


Let $A = \{x \mid f(x) \in B\}$. By Proposition 5.21(c), $B$ is the domain of a partial recursive function $g$. Then $A$ is the domain of the composition $g \circ f$. Since $g \circ f$ is partial recursive by substitution, $A$ is r.e. by Proposition 5.21(c).

Let $B$ be a recursive set and let $D$ be the inverse image of $B$ under a recursive function $f$. Then $x \in D$ if and only if $C_B(f(x)) = 0$, and $C_B(f(x)) = 0$ is a recursive relation.


Let $B$ be an r.e. set and let $A$ be the image $\{f(x) \mid x \in B\}$ under a partial recursive function $f$. If $B$ is empty, so is $A$. If $B$ is non-empty, then $B$ is the range of a recursive function $g$. Then $A$ is the range of the partial recursive function $f(g(x))$ and, by Proposition 5.21(b), $A$ is r.e.


Consider part (b). Given any natural number $x$, compute the value $f(x)$ and determine whether $f(x)$ is in $B$. This is an effective procedure for determining membership in the inverse image of $B$. Hence, by Church's thesis, the inverse image of $B$ is recursive.


Any non-empty r.e. set that is not recursive (such as that of Proposition 5.21(e)) is the range of a recursive function $g$ and is, therefore, the image of the recursive set of all natural numbers under the function $g$.


5.43 The proof has two parts: 


1. 


Let $A$ be an infinite recursive set. Let $g(i) = \mu x(x \in A \wedge (\forall j)_{j<i}(x \neq g(j)))$. Then $g(i) = h(i, g\#(i))$, where $h(i, u) = \mu x(x \in A \wedge (\forall j)_{j<i}(x \neq (u)_j))$. $h$ is recursive, and $g$ is recursive by Proposition 3.20. $g$ is strictly increasing and its range is $A$. (This proof is due to Gordon McLean, Jr.)


Let $A$ be the range of a strictly increasing recursive function $g$. Then $g(x) \geq x$ for all $x$ (by the special case of Proposition 4.15). Hence, $x \in A$ if and only if $(\exists u)_{u \leq x}\, g(u) = x$. So, $A$ is recursive by Proposition 3.18.
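Both halves of this argument are effective procedures and can be sketched in a few lines. The names `increasing_enumeration` and `member` are illustrative assumptions; the membership test `in_A` stands in for the characteristic function of an infinite recursive set, and the first function loops forever if the set is finite.

```python
from itertools import count, islice

def increasing_enumeration(in_A):
    """Yield g(0), g(1), ...: g(i) is the least x in A different from g(0), ..., g(i-1)."""
    seen = []
    while True:
        x = next(x for x in count() if in_A(x) and x not in seen)
        seen.append(x)
        yield x

def member(g, x):
    """Decide x in range(g) for strictly increasing g, using the bound g(u) >= u."""
    return any(g(u) == x for u in range(x + 1))

# A = multiples of 3, given by a decision procedure.
first_five = list(islice(increasing_enumeration(lambda x: x % 3 == 0), 5))
print(first_five)

# A = perfect squares, given as the range of the strictly increasing g(u) = u*u.
print(member(lambda u: u * u, 16), member(lambda u: u * u, 15))
```

The bounded search in `member` is the point of the second half: because $g(u) \geq u$, no argument larger than $x$ can have value $x$, so only finitely many values need to be inspected.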
5.44


5.46 


5.47 


5.48 


5.49 


Assume $A$ is an infinite r.e. set. Let $A$ be the range of the recursive function $g(x)$. Define the function $f$ by the following course-of-values recursion:

$f(n) = g(\mu y((\forall z)_{z<n}\, g(y) \neq f(z))) = g(\mu y((\forall z)_{z<n}\, g(y) \neq (f\#(n))_z))$

Then $A$ is the range of $f$, $f$ is one-one, and $f$ is recursive by Propositions 3.18 and 3.20. Intuitively, $f(0) = g(0)$ and, for $n > 0$, $f(n) = g(y)$, where $y$ is the least number for which $g(y)$ is different from $f(0), f(1), \ldots, f(n - 1)$.
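The intuitive description of $f$ translates into a short generator. This is a sketch under the assumption that `g` is a total function with infinite range (otherwise the search never ends); the name `injective_enumeration` and the sample function are hypothetical.

```python
from itertools import count, islice

def injective_enumeration(g):
    """Yield f(0), f(1), ...: f(n) = g(y) for the least y with g(y) not among earlier values."""
    seen = []
    while True:
        y = next(y for y in count() if g(y) not in seen)
        seen.append(g(y))
        yield g(y)

# Sample enumeration with repetitions: 3, 2, 1, 0, 1, 2, 3, 4, 5, ...
g = lambda y: abs(y - 3)
first_six = list(islice(injective_enumeration(g), 6))
print(first_six)
```

Keeping the list `seen` of earlier values plays the role of the course-of-values code $f\#(n)$ in the recursion.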


Let $A$ be an infinite r.e. set, and let $A$ be the range of the recursive function $g$. Since $A$ is infinite, $F(u) = \mu y(g(y) > u)$ is a recursive function. Define $G(0) = g(0)$, $G(n + 1) = g(\mu y(g(y) > G(n))) = g(F(G(n)))$. $G$ is a strictly increasing recursive function whose range is infinite and included in $A$. By Exercise 5.43, the range of $G$ is an infinite recursive subset of $A$.
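The function $G$ can likewise be sketched as a generator that repeatedly searches for the first value of $g$ exceeding the previous output. The name `increasing_subsequence` and the sample `g` are illustrative assumptions, and the search terminates only when the range of `g` is infinite.

```python
from itertools import count, islice

def increasing_subsequence(g):
    """Yield G(0) = g(0), G(n+1) = g(least y with g(y) > G(n))."""
    value = g(0)
    while True:
        yield value
        value = g(next(y for y in count() if g(y) > value))

# Sample enumeration 3, 2, 1, 0, 1, 2, 3, 4, 5, ... of the natural numbers.
g = lambda y: abs(y - 3)
first_four = list(islice(increasing_subsequence(g), 4))
print(first_four)
```

The output is strictly increasing by construction, so its range is decidable by the bounded search described in Exercise 5.43.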


a. By Corollary 5.16, there is a recursive function $g(u, v)$ such that $\varphi^1_{g(u,v)}(x) = \mu y(T_1(u, x, y) \vee T_1(v, x, y))$.


Assume (V). Let $f(x_1, \ldots, x_n)$ be effectively computable. Then the set $B = \{u \mid f((u)_1, \ldots, (u)_n) = (u)_{n+1}\}$ is effectively enumerable and, therefore, by (V), r.e. Hence, $u \in B \Leftrightarrow (\exists y)R(u, y)$ for some recursive relation $R$. Then

$f(x_1, \ldots, x_n) = \left(\left[\mu v\left(((v)_0)_1 = x_1 \wedge \cdots \wedge ((v)_0)_n = x_n \wedge R((v)_0, (v)_1)\right)\right]_0\right)_{n+1}$

So, $f$ is recursive. Conversely, assume Church's thesis and let $W$ be an effectively enumerable set. If $W$ is empty, then $W$ is r.e. If $W$ is non-empty, let $W$ be the range of the effectively computable function $g$. By Church's thesis, $g$ is recursive. But, $x \in W \Leftrightarrow (\exists u)(g(u) = x)$. Hence, $W$ is r.e. by Proposition 5.21(a).


Assume $A$ is r.e. Since $A \neq \emptyset$, $A$ is the range of a recursive function $g(z)$. So, for each $z$, $U(\mu y T_1(g(z), x, y))$ is total and, therefore, recursive. Hence, $U(\mu y T_1(g(x), x, y)) + 1$ is recursive. Then there must be a number $z_0$ such that $U(\mu y T_1(g(x), x, y)) + 1 = U(\mu y T_1(g(z_0), x, y))$. A contradiction results when $x = z_0$.


(a) Let ev) = n for all n. 
Let $\varphi(z) = \sigma_1^2(\mu y[T_1(z, \sigma_1^2(y), \sigma_2^2(y)) \wedge \sigma_1^2(y) > 2z])$, and let $B$ be the range of $\varphi$.


b. 


» 


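Let $A$ be r.e. Then $x \in A \Leftrightarrow (\exists y)R(x, y)$, where $R$ is recursive. Let $\mathscr{A}(x, y)$ express $R(x, y)$ in K. Then $k \in A \Leftrightarrow\ \vdash_K (\exists y)\mathscr{A}(\bar{k}, y)$.

Assume $k \in A \Leftrightarrow\ \vdash_K \mathscr{A}(\bar{k})$ for all natural numbers $k$. Then $k \in A \Leftrightarrow (\exists y)B_{\mathscr{A}}(k, y)$ and $B_{\mathscr{A}}$ is recursive (see the proof of Proposition 3.29 on page 201).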


Clearly $T_K$ is infinite. Let $f(x)$ be a recursive function with range $T_K$. Let $\mathscr{B}_0, \mathscr{B}_1, \ldots$ be the theorems of K, where $\mathscr{B}_j$ is the wf of K with Gödel number $f(j)$. Let $g(x, y)$ be the recursive function such that, if $x$ is the Gödel number of a wf $\mathscr{C}$, then $g(x, j)$ is the Gödel number of the conjunction $\mathscr{C} \wedge \mathscr{C} \wedge \cdots \wedge \mathscr{C}$ consisting of $j$ conjuncts; and, otherwise, $g(x, j) = 0$. Then $g(f(j), j)$ is the Gödel number of the $j$-fold conjunction $\mathscr{B}_j \wedge \mathscr{B}_j \wedge \cdots \wedge \mathscr{B}_j$. Let K′ be the theory whose axioms are all these $j$-fold conjunctions, for $j = 0, 1, 2, \ldots$ Then K′ and K have the same theorems. Moreover, the set of axioms of K′ is recursive. In fact, $x$ is the Gödel number of an axiom of K′ if and only if $x \neq 0 \wedge (\exists y)_{y \leq x}(g(f(y), y) = x)$. From an intuitive standpoint using Church's thesis, we observe that, given any wf $\mathscr{A}$, one can decide whether $\mathscr{A}$ is a conjunction $\mathscr{C} \wedge \mathscr{C} \wedge \cdots \wedge \mathscr{C}$; if it is such a conjunction, one can determine the number $j$ of conjuncts and check whether $\mathscr{C}$ is $\mathscr{B}_j$.

Part (b) follows from part (a).


Assume $\mathscr{B}(x_1)$ weakly expresses $(\overline{T_K})^*$ in K. Then, for any $n$, $\vdash_K \mathscr{B}(\bar{n})$ if and only if $n \in (\overline{T_K})^*$. Let $p$ be the Gödel number of $\mathscr{B}(x_1)$. Then $\vdash_K \mathscr{B}(\bar{p})$ if and only if $p \in (\overline{T_K})^*$. Hence, $\vdash_K \mathscr{B}(\bar{p})$ if and only if the Gödel number of $\mathscr{B}(\bar{p})$ is in $\overline{T_K}$; that is, $\vdash_K \mathscr{B}(\bar{p})$ if and only if not-$\vdash_K \mathscr{B}(\bar{p})$.

If K is recursively decidable, $T_K$ is recursive. Hence, $\overline{T_K}$ is recursive and, by Exercise 5.57, $(\overline{T_K})^*$ is recursive. So, $(\overline{T_K})^*$ is weakly expressible in K, contradicting part (a).

Use part (b); every recursive set is expressible, and, therefore, weakly expressible, in every consistent extension of K.


i. 8x). 
ii, X—X2 
iii. The function with empty domain. 
iv. The doubling function. 
i. fP(%1,0) = XxX 
f° (0, x2) =X2 
ft ((x1)',(%2)') = fP(X1,%2) 


ii. fe (%1,0) =X} 
fi (xis(2)') = (f? G1, 22)) 
f2(x1,0) =0 
FE (1,(%2)') = fr (fr (1,2), 1) 


iii, fi(O)=1 
fr((ay) =0 
f2(0)=0 


fi (ay) = ft (200) 


Any word P is transformed into QP. 
Any word P in A is transformed into PQ. 
Any word P in A is transformed into Q. 


Any word P in A is transformed into n, where n is the number of 
symbols in P. 


a&—>-A(Ein A) 
aro-A 

A> a 

a& > &a(in A) 
Ea>-A(Ein A) 
av-:A 

A>a@ 

&> A(Gin A) 
aar-A 

A>-o 

nb >nBé nin A) 
a§€—>&B Ea in A) 
p> y¥ 

yoda 

arv-A 


A-a@ 


aa; > Q,a7=1,...,4) 
a&€>EaEinA — {a,,...,4,)) 
arv-A 


A-a@ 
5.64 d. | BI>B
Bo| 
e. |B|= | 
f. Let a, B and 6 be new symbols. 
BIB 
a.|>|Bo 
a>A 
\|5 >| da 